Lecture Notes in 
Computer Science 1538 



Jieh Hsiang Atsushi Ohori (Eds.)



Advances in 
Computing Science - 
ASIAN’98 

4th Asian Computing Science Conference 
Manila, The Philippines, December 1998 
Proceedings 







Springer 




Lecture Notes in Computer Science 1538 

Edited by G. Goos, J. Hartmanis and J. van Leeuwen




Springer
Berlin, Heidelberg, New York, Barcelona, Hong Kong, London, Milan, Paris, Singapore, Tokyo








Series Editors 



Gerhard Goos, Karlsruhe University, Germany 
Juris Hartmanis, Cornell University, NY, USA 
Jan van Leeuwen, Utrecht University, The Netherlands 



Volume Editors 
Jieh Hsiang 

National Chi-Nan University, College of Science and Technology 

Puli, Nantou, Taiwan 

E-mail: hsiang@csie.ntu.edu.tw

Atsushi Ohori 
Kyoto University 

Research Institute for Mathematical Sciences 
Sakyo-ku, Kyoto 606-8502, Japan 
E-mail: ohori@kurims.kyoto-u.ac.jp



Cataloging-in-Publication data applied for 
Die Deutsche Bibliothek - CIP-Einheitsaufnahme 

Advances in computing science : proceedings / ASIAN 98, 4th Asian Computing 
Science Conference, Manila, The Philippines, December 8 - 10, 1998. Jieh 
Hsiang ; Atsushi Ohori (ed.). - Berlin ; Heidelberg ; New York ; Barcelona ; 

Hong Kong ; London ; Milan ; Paris ; Singapore ; Tokyo : Springer, 1998 
(Lecture notes in computer science ; Vol. 1538) 

ISBN 3-540-65388-0 



CR Subject Classification (1998): F, I.2.3, D.3
ISSN 0302-9743 

ISBN 3-540-65388-0 Springer-Verlag Berlin Heidelberg New York



This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
liable for prosecution under the German Copyright Law.

(c) Springer-Verlag Berlin Heidelberg 1998 
Printed in Germany 

Typesetting: Camera-ready by author 

SPIN 10692998 06/3142 5 4 3 2 1 0 Printed on acid-free paper 




Preface 



This volume contains the proceedings of the Fourth Asian Computing Science 
Conference (ASIAN98), held December 8-10, 1998, in Manila, the Philippines. 
The previous three ASIAN conferences were also published as Lecture Notes in 
Computer Science, Volumes 1023 (Bangkok, 1995), 1179 (Singapore, 1996), and 
1345 (Kathmandu, 1997). 

Initiated in 1995 by the Asian Institute of Technology in partnership with 
INRIA and UNU, the ASIAN conference series aims at providing a forum in Asia 
for the exchange of the most recent research ideas and results in computer science 
and information technology. While each year features several emphasized themes, 
the 1998 conference focuses on the research areas of (1) formal reasoning and 
verification, (2) programming languages, (3) data and knowledge representation, 
and (4) networking and Web computing. 

There were 43 submissions to the conference, out of which 17 were chosen
for presentation and inclusion in these proceedings. The papers were submitted
from Australia, Brazil, China, France, Germany, India, Italy, Japan, Korea, New
Zealand, the Philippines, Russia, Singapore, Spain, Switzerland, Taiwan, Thailand,
the United Kingdom, and the United States of America. The program committee
meeting was held virtually over the Internet. The selection was finalized
after a fifteen-day period of lively discussion. Each paper was carefully reviewed
and received at least three reports.

In addition to the 17 selected papers, this year's conference also features
a keynote speech by Jeannette M. Wing (Carnegie Mellon University) on Formal
Methods: Past, Present, and Future, two invited talks by Susumu Hayashi (Kobe
University) on Testing Proofs by Examples and Claude Kirchner (INRIA) on
The Rewriting Calculus as a Semantics of ELAN, and two tutorials by Tomasz
Janowski (UNU/IIST) on Semantics and Logic for Provable Fault-Tolerance and
R.K. Shyamasundar (TIFR) on Mobile Computation: Calculus and Languages.

We would like to take this opportunity to thank our sponsors, the steering
committee for the organization, the program committee and the reviewers for
the excellent and conscientious job in evaluating the papers, the local committee
for the local arrangements, Bill McCune and Claude Kirchner for generously
providing the package for Web PC management and technical help in setting
it up, and assistants Camel Dai and Takeomi Kato for managing the website.
Without the collective work of everyone this conference would not have been
possible.

Testing Proofs by Examples



Susumu Hayashi^1* and Ryosuke Sumitomo^2

^1 Department of Computer and Systems Engineering,
Faculty of Engineering, Kobe University,
1-1 Rokko-dai, Nada, Kobe, Japan,
shayashi@kobe-u.ac.jp
^2 Graduate School of Science and Technology
Kobe University, 1-1 Rokko-dai, Nada, Kobe, Japan,
sumitomo@pascal.seg.kobe-u.ac.jp



Abstract. We will present the project of proof animation, which started 
last April. The motivations, aims, and problems of the proof animation 
will be presented. We will also make a demo of ProofWorks, a small 
prototype tool for proof animation. 



Proof animation means executing formal proofs to find incorrectness in them.
A methodology of executing formal proofs as programs is known as "proofs as
programs" or "constructive programming." "Proofs as programs" is a means
to exclude incorrectness from programs with the aid of formal proof checking.
Although proof animation resembles proofs as programs, and in fact is a
contrapositive of proofs as programs, it seems to provide an entirely new area of
research in proof construction.

In spite of wide suspicions and criticisms, formal proof developments are
becoming a reality. We already have some large formal proofs, like Shankar's
proof of Gödel's incompleteness theorem, and proof libraries of mathematics and
computer science are being built by some teams aided by advanced proof checkers
such as Coq, HOL, Mizar, etc.

However, construction of big formal proofs is still very costly. The construction
of formal proofs is achieved only through dedicated labor by human beings.
Formal proof developments are much more time-consuming, and thus more costly,
activities than program developments.

Why is it so? 

A reason would be lack of means of testing proofs. Testing programs by exam- 
ples is less reliable than verifying programs formally. It is practically impossible 
to exclude all bugs of complicated software only by testing. Verification is supe- 
rior to testing for achieving “pure-water correctness,” a correctness at the degree 
of purity of pure water. 

However, testing is a much easier and more efficient way to find 80% or 90% of the bugs
in programs. Since the majority of software needs correctness only at the degree
of purity of tap water, the most standard way of debugging is still testing rather
than verification. Furthermore, the majority of people seem to find that testing
programs is more enjoyable than verification.

* Supported by No. 10480063, Monbusyo, Kaken-hi (the aid of Scientific Research,
The Ministry of Education)

For software developments, we have two options. However, we have only
one option for formal proof developments. Obviously checking formal proofs
by formal inference rules corresponds to verification. (In fact, the activity of
verification is a "subset" of formal proofs by formal proof checking.) Thus, we
may set up an equation

$$\frac{X}{\text{formal checking of proofs}} = \frac{\text{testing programs}}{\text{formal verification of programs}}$$

A solution X would be a means to find errors in formal proofs quickly and easily,
although it cannot certify pure-water correctness.

By the Curry-Howard isomorphism, a mathematical theory bridging functional
programs and proofs, the solution of this equation is

X = testing proofs by execution of proofs.

Since it resembles and shares aims with "animation of formal specifications" in
formal methods, we call it proof animation. We often call it "testing of proofs"
as well.

A plausible reaction to proof animation may be as follows: 

How can bugs exist in formal proofs? Formally checked proofs must be 
correct by definition! 

Bugs can exist in completely formalized proofs, since correctness of formalization
cannot be formally certified. This is an issue noticed in research
on formal methods, e.g., the Viper processor verification project [?], and, even earlier,
by some logicians and philosophers, e.g., L. Wittgenstein and S. Kripke [?].

We will argue that this difficulty is a source of inefficiency in formal proof
developments and that proof animation may ease it. A proof animation tool,
ProofWorks (Figure 1), is still under construction and case studies are yet to be
performed. Nonetheless, the experiences with proof-program developments in the PX
project [?] show that such a methodology can eliminate bugs from some kinds of
constructive proofs very quickly and easily.

In the talk, we will discuss the theoretical and technical problems to be solved 
to apply proof animation to actual proof development. 

We will also give demos of ProofWorks, if facilities are available. The current
version of ProofWorks is a Java applet proof checker. Figure 1 shows
ProofWorks running in a Web browser. The formal proof under development is
represented in the left box in a Mizar-like formal language. When the "Extract"
button is clicked, ProofWorks extracts the computational content of the proof, which
appears in the right box. The computational content is a pure functional program in the current
version of ProofWorks. ProofWorks associates the proof text with the program text.
When the cursor is positioned in one of the boxes and one of the >> or << buttons is clicked,
it shows the corresponding place in the other box. Thus, once a bug is found in the
program in the right box, ProofWorks can show the corresponding points in
the proof in the left box, where a bug likely sits.

A full paper and other information on proof animation will be available at

http://pascal.seg.kobe-u.ac.jp/~hayashi

Rigid Reachability*



Harald Ganzinger, Florent Jacquemard, and Margus Veanes 

Max-Planck-Institut für Informatik
Im Stadtwald, 66123 Saarbrücken, Germany



Abstract. We show that rigid reachability, the non-symmetric form of
rigid E-unification, is undecidable already in the case of a single constraint.
From this we infer the undecidability of a new, rather restricted
kind of second-order unification. We also show that certain decidable
subclasses of the problem which are P-complete in the equational case
become EXPTIME-complete when symmetry is absent. By applying
automata-theoretic methods, simultaneous monadic rigid reachability
with ground rules is shown to be in EXPTIME.



1 Introduction 

Rigid reachability is the problem, given a rewrite system $R$ and two terms $s$ and
$t$, whether there exists a ground substitution $\sigma$ such that $s\sigma$ rewrites in some
number of steps via $R\sigma$ into $t\sigma$. The term "rigid" stems from the fact that for no
rule more than one instance can be used in the rewriting process. Simultaneous
rigid reachability is the problem in which a substitution is sought which simultaneously
solves each member of a system of reachability constraints $(R_i, s_i, t_i)$. A
special case of [simultaneous] rigid reachability arises when the $R_i$ are symmetric,
containing for each rule $l \to r$ also its converse $r \to l$. The latter problem was
introduced in [?] as "simultaneous rigid E-unification". (Symmetric systems $R$
arise, for instance, from orienting a given set of equations $E$ in both directions.)
It has been shown in [5] that simultaneous rigid E-unification is undecidable,
whereas the non-simultaneous case with just one rigid equation to solve is
NP-complete [?]. The main result in this paper is that for non-symmetric rigid
reachability already the case of a single reachability constraint is undecidable,
even when the rule set is ground. From this we infer undecidability of a rather
restricted form of second-order unification for problems which contain just a single
second-order variable which, in addition, occurs at most twice in the unification
problem. The latter result contrasts with a statement to the contrary in [?].

The absence of symmetry makes the problem much more difficult. This phenomenon
is also observed in decidable cases which we investigate in the second
part of the paper. For instance we prove that a certain class of rigid problems
which is P-complete in the equational case becomes EXPTIME-complete when
symmetry is absent. Our results demonstrate a very thin borderline between the
decidable and the undecidable fragments of rigid reachability with respect to several
syntactical criteria. In particular, for ground $R$ and variable-disjoint $s$ and
$t$, the problem is undecidable, whereas it becomes decidable when, in addition,
either $s$ or $t$ is linear.

* A full version of this paper is available as MPI-I Research Report MPI-I-98-2-013.

In a later section we will apply automata-theoretic methods to the monadic
case and establish an EXPTIME upper bound for monadic simultaneous rigid
reachability for ground rewrite systems. This generalizes the analogous result
of [17] for simultaneous rigid E-unification. Also, our proof is more direct and
provides a better upper bound, closer to the PSPACE lower bound given in [?].
A PSPACE upper bound for this problem has been proved more recently in
joint work with Cortier [?].

2 Preliminaries 

A signature $\Sigma$ is a collection of function symbols with fixed arities $\ge 0$ and,
unless otherwise stated, $\Sigma$ is assumed to contain at least one constant, that
is, one function symbol with arity 0. The set of all constants in $\Sigma$ is denoted
by $Con(\Sigma)$. We use $a, b, c, d, a_1, \ldots$ for constants and $f, g, f_1, \ldots$ for function
symbols in general. A designated constant in $\Sigma$ is denoted by $c_\Sigma$.

A term language or simply language is a triple $L = (\Sigma_L, X_L, F_L)$ where (i)
$\Sigma_L$ is a signature, (ii) $X_L$ ($x, y, x_1, y_1, \ldots$) is a collection of first-order variables,
and (iii) $F_L$ ($F, G, F_1, F', \ldots$) is a collection of symbols with fixed arities $\ge 1$,
called second-order variables. The various sets of symbols are assumed to be
pairwise disjoint. Let $L$ be a language. $L$ is first-order if $F_L$ is empty; $L$ is
second-order otherwise. $L$ is monadic if all function symbols in $\Sigma_L$ have arity
$\le 1$. The set of all terms in a language $L$, or $L$-terms, is denoted by $T_L$. We
use $s, t, l, r, s_1, \ldots$ for terms. We usually omit mentioning $L$ when it is clear from
the context. The set of first-order variables of a term $t$ is denoted by $Var(t)$. A
ground term is one that contains no variables. The set of all ground terms in
a language $L$ is denoted by $T_{\Sigma_L}$. A term is called shallow if all variables in it
occur at depth $\le 1$. The size $\|t\|$ of a term $t$ is defined recursively by: $\|t\| = 1$ if
$t \in X_L \cup Con(\Sigma_L)$ and $\|f(t_1, \ldots, t_n)\| = \|t_1\| + \cdots + \|t_n\| + 1$ when $f \in \Sigma_L \cup F_L$.
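The definitions of term size, ground terms and shallow terms can be sketched in a few lines of code. The following is only an illustration for first-order terms, under an assumed encoding that is not part of the paper: a variable is a string, and an application $f(t_1, \ldots, t_n)$ is the tuple `("f", t1, ..., tn)`, so a constant is a 1-tuple. The function names are hypothetical.

```python
# Assumed encoding (not from the paper): a variable is a str; an application
# f(t1, ..., tn) is the tuple ("f", t1, ..., tn); a constant c is ("c",).

def size(t):
    """||t|| = 1 for variables and constants, else 1 + sum of argument sizes."""
    if isinstance(t, str):
        return 1
    return 1 + sum(size(arg) for arg in t[1:])

def variables(t):
    """Var(t): the set of first-order variables occurring in t."""
    if isinstance(t, str):
        return {t}
    result = set()
    for arg in t[1:]:
        result |= variables(arg)
    return result

def is_ground(t):
    """A ground term contains no variables."""
    return not variables(t)

def is_shallow(t, depth=0):
    """A term is shallow if every variable occurs at depth <= 1."""
    if isinstance(t, str):
        return depth <= 1
    return all(is_shallow(arg, depth + 1) for arg in t[1:])

example = ("f", "x", ("g", ("a",)))   # the term f(x, g(a))
```

For `example`, the size is 4, the only variable is `x`, and the term is shallow because `x` occurs directly below the root.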

We assume that the reader is familiar with the basic concepts in term rewriting
[?]. We write $u[s]$ when $s$ occurs as a subterm of $u$. In that case $u[t]$ denotes
the replacement of the indicated occurrence of $s$ by $t$. An equation in $L$ is an
unordered pair of $L$-terms, denoted by $s \approx t$. A rule in $L$ is an ordered pair of
$L$-terms, denoted by $s \to t$. An equation or a rule is ground if the terms in it are
ground. A system is a finite set. Let $R$ be a system of ground rules, and $s$ and $t$
two ground terms. Then $s$ rewrites in $R$ to $t$, denoted by $s \to_R t$, if $t$ is obtained
from $s$ by replacing an occurrence of a term $l$ in $s$ by a term $r$ for some rule $l \to r$
in $R$. The term $s$ reduces in $R$ to $t$, denoted by $s \to^*_R t$, if either $s = t$ or $s$ rewrites
to a term that reduces to $t$. $R$ is called symmetric if, with any rule $l \to r$ in $R$, $R$
also contains its converse $r \to l$. Below we shall not distinguish between systems
of equations and symmetric systems of rewrite rules. The size of a system $R$ is
the sum of the sizes of its components: $\|R\| = \sum_{l \to r \in R} (\|l\| + \|r\|)$.
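To make the relations "rewrites to" and "reduces to" concrete, here is a minimal sketch for ground rule systems. The tuple encoding of terms, the function names and the `max_terms` search budget are assumptions made for illustration; the breadth-first search is a naive procedure, not the method used in the paper.

```python
from collections import deque

# Assumed encoding: a ground term f(t1, ..., tn) is the tuple ("f", t1, ..., tn);
# a ground rule l -> r is the pair (l, r).

def rewrite_once(t, rules):
    """Yield every term obtained from t by replacing one occurrence of a
    left-hand side l by the corresponding right-hand side r."""
    for l, r in rules:
        if t == l:
            yield r
    symbol, args = t[0], t[1:]
    for i, arg in enumerate(args):
        for new_arg in rewrite_once(arg, rules):
            yield (symbol,) + args[:i] + (new_arg,) + args[i + 1:]

def reduces(s, t, rules, max_terms=10_000):
    """Breadth-first search for s ->* t. Raises RuntimeError when the
    (hypothetical) budget of visited terms is exhausted."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in rewrite_once(u, rules):
            if v not in seen:
                if len(seen) >= max_terms:
                    raise RuntimeError("search budget exhausted")
                seen.add(v)
                queue.append(v)
    return False

a, b, c = ("a",), ("b",), ("c",)
rules = [(a, b), (("f", b), c)]       # a -> b and f(b) -> c
```

With these rules, `f(a)` reduces to `c` in two steps (`f(a)`, `f(b)`, `c`), while `c` does not reduce to `f(a)` because rules are applied only from left to right.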

Rigid Reachability. Let $L$ be a first-order language. A reachability constraint, or
simply a constraint in $L$, is a triple $(R, s, t)$ where $R$ is a set of rules in $L$, and $s$ and
$t$ are terms in $L$. We refer to $R$, $s$ and $t$ as the rule set, the source term and the
target term, respectively, of the constraint. A substitution $\theta$ in $L$ solves $(R, s, t)$
(in $L$) if $\theta$ is grounding for $R$, $s$ and $t$, and $s\theta \to^*_{R\theta} t\theta$. The problem of solving
constraints (in $L$) is called rigid reachability (for $L$). A system of constraints is
solvable if there exists a substitution that solves all constraints in that system.
Simultaneous rigid reachability or SRR is the problem of solving systems of
constraints. Monadic (simultaneous) rigid reachability is (simultaneous) rigid
reachability for monadic languages.
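The role of rigidity (a single instance $R\theta$ of the rule set is used throughout) can be illustrated with a brute-force check. This is only a sketch under stated assumptions: terms use an assumed tuple encoding with variables as strings, the candidate ground terms are supplied by the caller, and the search budget `max_terms` is hypothetical. Since the general problem is shown undecidable in this paper, such an enumeration can only ever explore a finite part of the search space.

```python
from collections import deque
from itertools import product

# Assumed encoding: variables are str, f(t1, ..., tn) is ("f", t1, ..., tn).

def substitute(t, theta):
    """Apply the substitution theta (a dict from variables to ground terms)."""
    if isinstance(t, str):
        return theta[t]
    return (t[0],) + tuple(substitute(arg, theta) for arg in t[1:])

def rewrite_once(t, rules):
    """Yield all results of one rewrite step on the ground term t."""
    for l, r in rules:
        if t == l:
            yield r
    for i, arg in enumerate(t[1:], start=1):
        for new_arg in rewrite_once(arg, rules):
            yield t[:i] + (new_arg,) + t[i + 1:]

def reduces(s, t, rules, max_terms=10_000):
    """Bounded breadth-first search for s ->* t with ground rules."""
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in rewrite_once(u, rules):
            if v not in seen:
                if len(seen) >= max_terms:
                    raise RuntimeError("search budget exhausted")
                seen.add(v)
                queue.append(v)
    return False

def solve(rules, s, t, variables, candidates):
    """Try every assignment of candidate ground terms to the variables and
    return the first theta with s.theta ->* t.theta via the single instance
    R.theta of the rule set, or None if no candidate assignment works."""
    for values in product(candidates, repeat=len(variables)):
        theta = dict(zip(variables, values))
        ground_rules = [(substitute(l, theta), substitute(r, theta))
                        for l, r in rules]
        if reduces(substitute(s, theta), substitute(t, theta), ground_rules):
            return theta
    return None

a, b = ("a",), ("b",)
rules = [(("g", "x"), "x")]                   # the single rule g(x) -> x
source = ("h", ("g", a), ("g", b))            # h(g(a), g(b))
```

With the rule $g(x) \to x$ and source $h(g(a), g(b))$, the target $h(a, g(b))$ is reached with $x \mapsto a$, but the target $h(a, b)$ would need two different instances of the rule and is therefore not reachable with the candidates $a$ and $b$.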

Rigid E-unification is rigid reachability for constraints $(E, s, t)$ with sets of
equations $E$. Simultaneous Rigid E-unification or SREU is defined accordingly.

Finite Tree Automata. Finite bottom-up tree automata, or simply tree automata
from here on, are a generalization of classical automata [?]. Using
a rewrite rule based definition [?], a tree automaton (or TA) $A$ is a quadruple
$(Q_A, \Sigma_A, R_A, F_A)$, where (i) $Q_A$ is a finite set of constants called states, (ii) $\Sigma_A$
is a signature that is disjoint from $Q_A$, (iii) $R_A$ is a system of rules of the form
$f(q_1, \ldots, q_n) \to q$, where $f \in \Sigma_A$ has arity $n \ge 0$ and $q, q_1, \ldots, q_n \in Q_A$, and
(iv) $F_A \subseteq Q_A$ is the set of final states. The size of a TA $A$ is $\|A\| = |Q_A| + \|R_A\|$.

We denote by $T(A, q)$ the set $\{ t \in T_{\Sigma_A} \mid t \to^*_{R_A} q \}$ of ground terms accepted by
$A$ in state $q$. The set of terms recognized by the TA $A$ is the set $T(A) = \bigcup_{q \in F_A} T(A, q)$.

A set of terms is called recognizable or regular if it is recognized by some TA.
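A bottom-up run of a tree automaton can be sketched as follows. The encoding is an assumption made for illustration: ground terms are tuples `("f", t1, ..., tn)`, and the rule set $R_A$ is a dictionary mapping a pair (symbol, tuple of argument states) to the set of states $q$ with a rule $f(q_1, \ldots, q_n) \to q$. The example automaton over a zero constant and a unary successor is hypothetical.

```python
from itertools import product

def states(t, rules):
    """Return the set of states q such that t ->* q using the automaton rules."""
    child_states = [states(arg, rules) for arg in t[1:]]
    result = set()
    # For a constant, product() yields exactly one empty tuple.
    for combo in product(*child_states):
        result |= rules.get((t[0], combo), set())
    return result

def accepts(t, rules, final):
    """t is recognized if it reaches at least one final state."""
    return bool(states(t, rules) & final)

# Hypothetical automaton recognizing the even numerals s(s(...(zero))).
even_rules = {
    ("zero", ()): {"even"},
    ("s", ("even",)): {"odd"},
    ("s", ("odd",)): {"even"},
}
final_states = {"even"}

zero = ("zero",)
two = ("s", ("s", zero))
```

Here `two` is accepted because it reaches the state `even`, while `("s", zero)` only reaches `odd` and is rejected.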

Word automata. In monadic signatures, every function symbol has an arity
of at most 1, thus terms are words. For monadic signatures, we thus use the
traditional, equivalent concepts of alphabets, words, finite automata, and regular
expressions. A word with a variable $a_1 \ldots a_n x$ corresponds to the term
$a_1(a_2(\ldots a_n(x)))$ in $T_\Sigma$. The substitution of $x$ by a term $u$ is the same as the
concatenation of the respective words. A finite (word) automaton $A$ is a tuple
$(Q_A, \Sigma_A, R_A, q_A, F_A)$ where the components $Q_A$, $\Sigma_A$, $R_A$, $F_A$ have the same
form and meaning as the corresponding components of a tree automaton over a
monadic signature, and where, additionally, $q_A$ is the initial state. A transition
$a(q) \to q'$ of $R_A$ ($a \in \Sigma_A$, $q, q' \in Q_A$) is denoted $q \xrightarrow{a} q'$.
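The correspondence between monadic terms and words can be sketched as below. The encoding is assumed for illustration only: a unary application $a(t)$ is the pair `("a", t)`, a constant is a 1-tuple, a variable is a string, and each letter of the alphabet is a single character.

```python
def word_to_term(word, tail):
    """Turn the word a1...an followed by `tail` into a1(a2(...an(tail)))."""
    t = tail
    for letter in reversed(word):
        t = (letter, t)
    return t

def term_to_word(t):
    """Split a monadic term into its list of letters and its innermost tail."""
    word = []
    while isinstance(t, tuple) and len(t) == 2:
        word.append(t[0])
        t = t[1]
    return word, t

def substitute(t, var, u):
    """Replace the variable `var` in the monadic term t by the term u."""
    if t == var:
        return u
    if isinstance(t, str) or len(t) == 1:
        return t
    return (t[0], substitute(t[1], var, u))

abx = word_to_term("ab", "x")            # the term a(b(x))
c_e = word_to_term("c", ("e",))          # the term c(e), with constant e
```

Substituting `c_e` for `x` in `abx` yields the same term as building the concatenated word `abc` over the constant `e`, which is the sense in which substitution is concatenation.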

Second-Order Unification. Second-order unification is unification for second-order
terms. For representing unifiers, we need expressions representing functions
which, when applied, produce instances of a term in the given language $L$.
Following Goldfarb [?] and Farmer [?], we therefore introduce the concept of
an expansion $L^*$ of $L$. Let $\{z_i\}_{i \ge 1}$ be an infinite collection of new symbols not
in $L$. The language $L^*$ differs from $L$ by having $\{z_i\}_{i \ge 1}$ as additional first-order
variables, called bound variables. The rank of a term $t$ in $L^*$ is either 0 if $t$
contains no bound variables (i.e., $t \in T_L$), or the largest $n$ such that $z_n$ occurs
in $t$. Given terms $t$ and $t_1, t_2, \ldots, t_n$ in $L^*$, we write $t[t_1, t_2, \ldots, t_n]$ for the term
that results from $t$ by simultaneously replacing $z_i$ in $t$ by $t_i$ for $1 \le i \le n$. An
$L^*$-term is called closed if it contains no variables other than bound variables.
Note that closed $L^*$-terms of rank 0 are ground $L$-terms.

A substitution in $L$ is a function $\theta$ with finite domain $dom(\theta) \subseteq X_L \cup F_L$
that maps first-order variables to $L$-terms, and $n$-ary second-order variables to
$L^*$-terms of rank $\le n$. The result of applying a substitution $\theta$ to an $L$-term $s$,
denoted by $s\theta$, is defined by induction on $s$:

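1. If $s = x$ and $x \in dom(\theta)$ then $s\theta = \theta(x)$.

2. If $s = x$ and $x \notin dom(\theta)$ then $s\theta = x$.

3. If $s = F(t_1, \ldots, t_n)$ and $F \in dom(\theta)$ then $s\theta = \theta(F)[t_1\theta, \ldots, t_n\theta]$.

4. If $s = F(t_1, \ldots, t_n)$ and $F \notin dom(\theta)$ then $s\theta = F(t_1\theta, \ldots, t_n\theta)$.

5. If $s = f(t_1, \ldots, t_n)$ then $s\theta = f(t_1\theta, \ldots, t_n\theta)$.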
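The five cases above can be sketched directly in code. The tagged-tuple encoding, the function names and the example substitution are assumptions made for illustration: `("var", "x")` is a first-order variable, `("bound", i)` is the bound variable $z_i$, `("app", "f", t1, ..., tn)` applies a function symbol, and `("sovar", "F", t1, ..., tn)` applies a second-order variable. A substitution is a dictionary keyed by variable names, which are assumed to be pairwise distinct.

```python
def instantiate(body, args):
    """Compute body[t1, ..., tn]: replace each bound variable z_i by args[i-1]."""
    tag = body[0]
    if tag == "bound":
        return args[body[1] - 1]
    if tag == "var":
        return body
    return body[:2] + tuple(instantiate(sub, args) for sub in body[2:])

def apply_subst(s, theta):
    """Apply theta to the L-term s, following cases 1 to 5."""
    tag = s[0]
    if tag == "var":
        return theta.get(s[1], s)                      # cases 1 and 2
    if tag == "bound":
        return s                                       # not part of L-terms
    args = tuple(apply_subst(sub, theta) for sub in s[2:])
    if tag == "sovar" and s[1] in theta:
        return instantiate(theta[s[1]], args)          # case 3
    return s[:2] + args                                # cases 4 and 5

# Hypothetical example: F maps to cons(z1, z2), x maps to the constant a.
theta = {
    "F": ("app", "cons", ("bound", 1), ("bound", 2)),
    "x": ("app", "a"),
}
term = ("sovar", "F", ("var", "x"), ("app", "nil"))    # F(x, nil)
```

Applying `theta` to `F(x, nil)` first instantiates the arguments and then plugs them into the body bound to `F`, giving `cons(a, nil)`.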

We also write $F\theta$ for $\theta(F)$, where $F$ is a second-order variable. A substitution is
called closed if its range is a set of closed terms. Given a term $t$, a substitution
$\theta$ is said to be grounding for $t$ if $t\theta$ is ground, and similarly for other $L$-expressions.
Given a sequence $\bar{t} = t_1, \ldots, t_n$ of terms, we write $\bar{t}\theta$ for $t_1\theta, \ldots, t_n\theta$.

Let $E$ be a system of equations in $L$. A unifier of $E$ is a substitution $\theta$
(in $L$) such that $s\theta = t\theta$ for all equations $s \approx t$ in $E$. $E$ is unifiable if there
exists a unifier of $E$. Note that if $E$ is unifiable then it has a closed unifier that
is grounding for $E$, since $T_{\Sigma_L}$ is nonempty. The unification problem for $L$ is
the problem of deciding whether a given equation system in $L$ is unifiable. In
general, the second-order unification problem or SOU is the unification problem
for arbitrary second-order languages. Monadic SOU is SOU for monadic second-order
languages. By SOU with one second-order variable we mean the unification
problem for second-order languages $L$ such that $|F_L| = 1$.

Following common practice, by an exponential function we mean an integer
function of the form $f(n) = 2^{P(n)}$, where $P$ is a polynomial. The complexity class
EXPTIME is defined accordingly.

3 Rigid Reachability Is Undecidable 

We prove that rigid reachability is undecidable. The undecidability holds already 
for constraints with some fixed ground rule set which is, moreover, terminating. 
Our main tool in proving the undecidability result is the following statement. 

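Lemma 1. One can effectively construct two tree automata
$A_{mv} = (Q_{mv}, \Sigma_{mv}, R_{mv}, \{q_{mv}\})$ and $A_{id} = (Q_{id}, \Sigma_{id}, R_{id}, \{q_{id}\})$, and two canonical systems
of ground rules $\Pi_1, \Pi_2 \subseteq T_{\Sigma_{mv}} \times T_{\Sigma_{id}}$, where the only common symbol in $A_{mv}$
and $A_{id}$ is a binary function symbol "$\cdot$"^1, such that it is undecidable whether,
given $t_{id} \in T_{\Sigma_{id}}$, there exist $s \in T(A_{mv})$ and $t \in T(A_{id})$ such that $s \to^*_{\Pi_1} t$ and
$t_{id} \cdot s \to^*_{\Pi_2} t$.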

The main idea behind the proof of Lemma 1 is illustrated in Figure 1. In the
rest of this section, we consider fixed $A_{mv}$, $A_{id}$, $\Pi_1$ and $\Pi_2$ given by Lemma 1.

Fig. 1. [...] represent a sequence of moves of a given Turing machine, where $v_i^+$ is
the successor of $v_i$ according to the transition function of the TM.
Each term $t$ recognized by $A_{id}$ represents a sequence of IDs of the TM
$(w_1, w_2, \ldots, w_n)$.
The two rewrite systems $\Pi_1$ and $\Pi_2$ are such that $s$ reduces in $\Pi_1$ to $t$ if and only if
$v_i = w_i$ for $1 \le i \le k = n$, and $t_{id} \cdot s$ reduces in $\Pi_2$ to $t$ if and only if $t_{id}$ represents
$w_1$, $v_i^+ = w_{i+1}$ for $1 \le i < n$, and $w_n$ is the final ID of the TM. It follows that such
$s$ and $t$ exist if and only if the TM accepts the input string represented by $t_{id}$.

Undecidability of simultaneous rigid E-unification follows from this lemma
by viewing the rules $R_{mv}$ and $R_{id}$ of the automata $A_{mv}$ and $A_{id}$, respectively, as
well as the rewrite systems $\Pi_1$ and $\Pi_2$, as sets of equations, and by formulating
the reachability constraints between $s$ and $t$ as a system of rigid equations. It is
not possible, though, to achieve the same effect by a single rigid E-unification
constraint for a combined system of equations. The interference between the
component systems cannot be controlled due to the symmetry of equality. This
is different for reachability, where rewrite rules are only applied from left to
right. In fact, our main idea in the undecidability proof is to combine the four
rewrite systems $R_{mv}$, $R_{id}$, $\Pi_1$, and $\Pi_2$ into a single system and achieve mutual
non-overlapping of rewrite rules by renaming the constants in the respective
signatures.

^1 We write $\cdot$ ("dot") as an infix operator.

3.1 Renaming of Constants 

For any integer $m$ and a signature $\Sigma$ we write $\Sigma^{(m)}$ for the constant-disjoint
copy of $\Sigma$ where each constant $c$ has been replaced with a new constant $c^{(m)}$;
we say that $c^{(m)}$ has label $m$. Note that non-constant symbols are not renamed.
For a ground term $t$ and a set of ground rules $R$ over $\Sigma$, we define $t^{(m)}$ and $R^{(m)}$
over $\Sigma^{(m)}$ accordingly.

Given a signature $\Sigma$ and two different integers $m$ and $n$, we write $\Sigma^{(m,n)}$ for
the following set of rules that simply replaces each label $m$ with label $n$:

$$\Sigma^{(m,n)} = \{\, c^{(m)} \to c^{(n)} \mid c \in Con(\Sigma) \,\}.$$

We write $\Pi^{(m,n)}$, where $\Pi$ is either $\Pi_1$ or $\Pi_2$, for the following set of rules:

$$\Pi^{(m,n)} = \{\, l^{(m)} \to r^{(n)} \mid l \to r \in \Pi \,\}.$$
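The renaming of constants can be illustrated with a short sketch. The encoding is an assumption, not the paper's: ground terms are tuples `("f", t1, ..., tn)`, constants are 1-tuples, and the labeled constant $c^{(m)}$ is written as the string `"c(m)"`. The function names are hypothetical.

```python
def label(t, m):
    """Compute t^(m): attach label m to every constant of the ground term t.
    Non-constant symbols are left unchanged."""
    if len(t) == 1:
        return (f"{t[0]}({m})",)
    return (t[0],) + tuple(label(arg, m) for arg in t[1:])

def relabel_rules(constants, m, n):
    """Sigma^(m,n): one rule c^(m) -> c^(n) for each constant c."""
    return [((f"{c}({m})",), (f"{c}({n})",)) for c in sorted(constants)]

def label_rules(pi, m, n):
    """Pi^(m,n): label left-hand sides with m and right-hand sides with n."""
    return [(label(l, m), label(r, n)) for l, r in pi]

term = (".", ("a",), ("b",))                 # the term a . b
pi = [(("f", ("a",)), ("b",))]               # a single ground rule f(a) -> b
```

For instance, labeling `a . b` with 2 gives `a(2) . b(2)`, and `label_rules(pi, 0, 5)` turns `f(a) -> b` into `f(a(0)) -> b(5)`, so that left-hand and right-hand sides live in constant-disjoint copies of the signature.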



Lemma 2. Let $m$, $n$, $k$ and $l$ be distinct integers. The statements (i) and (ii)
are equivalent for all $s \in T_{\Sigma_{mv}}$ and $t_{id}, t \in T_{\Sigma_{id}}$.

(i) $s \to^*_{\Pi_1} t$ and $t_{id} \cdot s \to^*_{\Pi_2} t$.

(ii) $s^{(m)} \to^*_{\Pi_1^{(m,n)}} t^{(n)}$ and $t_{id}^{(l)} \cdot s^{(k)} \to^*_{\Pi_2^{(k,l)}} t^{(l)}$.

Proof. The left-hand sides of the rules in $\Pi_1$ and $\Pi_2$ are terms in $T_{\Sigma_{mv}}$ and the
right-hand sides of the rules in $\Pi_1$ and $\Pi_2$ are terms in $T_{\Sigma_{id}}$. But $\Sigma_{mv}$ and $\Sigma_{id}$
are constant-disjoint. □



3.2 The Main Construction 

Let $R_u$ be the following system of ground rules:

$$R_u = R_{mv}^{(0)} \cup R_{mv}^{(2)} \cup R_{id}^{(4)} \cup R_{id}^{(6)} \cup \Sigma_{mv}^{(0,1)} \cup \Sigma_{mv}^{(2,1)} \cup \Sigma_{id}^{(4,3)} \cup \Sigma_{id}^{(6,3)} \cup \Sigma_{id}^{(4,5)} \cup \Pi_1^{(0,5)} \cup \Sigma_{id}^{(6,7)} \cup \Pi_2^{(2,7)}$$

Note that constants with odd labels occur only in the right-hand sides of rules
and can, once introduced, subsequently not be removed by $R_u$. Let $f_u$ be a new
function symbol with arity 12. We consider the following constraint:

$$\bigl( R_u,\; f_u(x_0, x_2, x_0, x_2, y_4, y_6, y_4, y_6, y_4, x_0, y_6, t_{id}^{(7)} \cdot x_2),\; f_u(q_{mv}^{(0)}, q_{mv}^{(2)}, x_1, x_1, q_{id}^{(4)}, q_{id}^{(6)}, y_3, y_3, y_5, y_5, y_7, y_7) \bigr) \qquad (1)$$

Our goal is to show that solvability of (1), for a given $t_{id} \in T_{\Sigma_{id}}$, is equivalent to
the existence of $s$ and $t$ satisfying the condition in Lemma 1. Note that, for all
ground terms $t_i$ and $s_i$, for $1 \le i \le 12$,

$$f_u(t_1, \ldots, t_{12}) \to^*_{R_u} f_u(s_1, \ldots, s_{12}) \iff t_i \to^*_{R_u} s_i \quad (\text{for } 1 \le i \le 12).$$

As a first step, we prove a lemma that allows us to separate the different subsystems
of $R_u$ that are relevant for the reductions between the corresponding
arguments of $f_u$ in the source term and the target term of (1).

Lemma 3. For every substitution $\theta$, $\theta$ solves the constraint (1) if and only if $\theta$
solves the following system of constraints:

$$(R_{mv}^{(0)}, x_0, q_{mv}^{(0)}),\quad (R_{mv}^{(2)}, x_2, q_{mv}^{(2)}),\quad (\Sigma_{mv}^{(0,1)}, x_0, x_1),\quad (\Sigma_{mv}^{(2,1)}, x_2, x_1) \qquad (2)$$

$$(R_{id}^{(4)}, y_4, q_{id}^{(4)}),\quad (R_{id}^{(6)}, y_6, q_{id}^{(6)}),\quad (\Sigma_{id}^{(4,3)}, y_4, y_3),\quad (\Sigma_{id}^{(6,3)}, y_6, y_3) \qquad (3)$$

$$(\Sigma_{id}^{(4,5)}, y_4, y_5),\quad (\Pi_1^{(0,5)}, x_0, y_5) \qquad (4)$$

$$(\Sigma_{id}^{(6,7)}, y_6, y_7),\quad (\Pi_2^{(2,7)}, t_{id}^{(7)} \cdot x_2, y_7) \qquad (5)$$

Proof. The direction '⇐' is immediate, since if $\theta$ solves a constraint $(R, s, t)$ then
obviously it solves any constraint $(R', s, t)$ where $R \subseteq R'$.

The direction '⇒' can be proved by case analysis, many cases being symmetrical.
For instance, we show that if $\theta$ solves (1), then $\theta$ solves the first two constraints
of (2), namely $x_i\theta \to^*_{R_{mv}^{(i)}} q_{mv}^{(i)}$ for $i = 0, 2$. We give the proof for $i = 0$, which, by
symmetry, also proves the case $i = 2$. We know that $x_0\theta \to^*_{R_u} q_{mv}^{(0)}$. We prove by
induction on the length of reductions that, for all $t$, if $t \to^*_{R_u} q_{mv}^{(0)}$ then $t \to^*_{R_{mv}^{(0)}} q_{mv}^{(0)}$.
The base case (reduction is empty) holds trivially. If the reduction is nonempty,
then we have, for some $l \to r \in R_u$ and by using the induction hypothesis, that
$t \to_{l \to r} s \to^*_{R_{mv}^{(0)}} q_{mv}^{(0)}$. Therefore all constants in $r$ have label 0, since $r$ is a subterm
of $s$ and $s \in T_{\Sigma_{mv}^{(0)} \cup Q_{mv}^{(0)}}$. Hence $l \to r \in R_{mv}^{(0)}$, and consequently $t \to^*_{R_{mv}^{(0)}} q_{mv}^{(0)}$. □

The following lemma relates the solvability of (1) to Lemma 1.

Lemma 4. For $t_{id} \in T_{\Sigma_{id}}$, the constraint (1) is solvable if and only if there
exist $s \in T(A_{mv})$ and $t \in T(A_{id})$ such that $s \to^*_{\Pi_1} t$ and $t_{id} \cdot s \to^*_{\Pi_2} t$.

Proof. (⇐) Assuming that the given $s$ and $t$ exist, define $x_i\theta = s^{(i)}$ for $i \in \{0, 1, 2\}$
and $y_i\theta = t^{(i)}$ for $i \in \{3, 4, 5, 6, 7\}$. It follows easily from Lemma 2 and Lemma 3
that $\theta$ solves (1).

(⇒) Assume that $\theta$ solves (1). By Lemma 3, $\theta$ solves (2)-(5). First we observe
the following facts.

(i) From $\theta$ solving (2), it follows that there exists $s \in T(A_{mv})$ such that $x_0\theta = s^{(0)}$
and $x_2\theta = s^{(2)}$.

(ii) From $\theta$ solving (3), it follows that there exists $t \in T(A_{id})$ such that $y_4\theta = t^{(4)}$
and $y_6\theta = t^{(6)}$.

From $\theta$ solving (4) and by using (ii), it follows that $y_5\theta = t^{(5)}$. Now, due to the
second component of (4) and by using (i), we get that $s^{(0)} \to^*_{\Pi_1^{(0,5)}} t^{(5)}$.

From $\theta$ solving (5) and by using (ii), it follows that $y_7\theta = t^{(7)}$. Now, due to
the second component of (5) and by using (i), we get that $t_{id}^{(7)} \cdot s^{(2)} \to^*_{\Pi_2^{(2,7)}} t^{(7)}$.

Finally, use Lemma 2. □

We conclude with the following result. 

Theorem 1. Rigid reachability is undecidable. Specifically, it is undecidable al- 
ready under the following restrictions: 

— the rule set is some fixed ground terminating rewrite system; 

— there are at most eight variables; 

— each variable occurs at most three times; 

— the source term and the target term do not share variables. 

Proof. The undecidability follows from Lemma 1 and Lemma 4. The system $R_u$
is easily seen to be terminating, by simply examining the subsystems. The other
restrictions follow immediately as properties of (1). □

We have not attempted to minimize the number of variables in (1). Observe also
that all but one of the occurrences of variables are shallow (the target term is
shallow).

4 A New Undecidability Result for SOU

We prove that SOU is undecidable already when unification problems contain
just a single second-order variable which, in addition, occurs twice. This result
contrasts with a claim to the contrary in [?]. Let $\Sigma_u$ be the signature consisting of
the symbols in $R_u$ and the symbol $f_u$. Let $R_u = \{ l_i \to r_i \mid 1 \le i \le m \}$. Let $\bar{l}_u$
denote the sequence $l_1, l_2, \ldots, l_m$ and $\bar{r}_u$ the sequence $r_1, r_2, \ldots, r_m$. Let $L_u$ be
the following language:

$$L_u = (\Sigma_u, \{x_0, x_1, x_2, y_3, y_4, y_5, y_6, y_7\}, \emptyset)$$

Let $F_u$ be a new second-order variable with arity $m + 1$. Let cons be a new
binary function symbol and nil a new constant. The language $L_1$ is defined as
the following expansion of $L_u$:

$$L_1 = (\Sigma_u \cup \{\mathrm{cons}, \mathrm{nil}\}, X_{L_u}, \{F_u\}).$$

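We can show that, given $t_{id} \in T_{\Sigma_{id}}$, the following second-order equation in $L_1$ is
solvable if and only if the constraint (1) is solvable:

$$F_u(\bar{l}_u, \mathrm{cons}(f_u(q_{mv}^{(0)}, q_{mv}^{(2)}, x_1, x_1, q_{id}^{(4)}, q_{id}^{(6)}, y_3, y_3, y_5, y_5, y_7, y_7), \mathrm{nil})) \approx \mathrm{cons}(f_u(x_0, x_2, x_0, x_2, y_4, y_6, y_4, y_6, y_4, x_0, y_6, t_{id}^{(7)} \cdot x_2), F_u(\bar{r}_u, \mathrm{nil})) \qquad (6)$$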



Lemma 5. Given $t_{id} \in T_{\Sigma_{id}}$, (1) is solvable if and only if (6) is solvable.

Proof. The direction '⇒' follows from [?, Lemma 2] and the observation that if
$\theta$ solves (1) then $x\theta \in T_{\Sigma_u}$ for all $x \in X_{L_u}$. In particular, it is not possible that
cons or nil appear in the terms that are substituted for $X_{L_u}$.

We prove the other direction. Assume that $\theta$ solves (6). We show that $\theta$
solves (1). A straightforward inductive argument shows that $F_u\theta$ is an $L_1^*$-term
of rank $m + 1$ of the following form (recall that $z_i$ is the $i$th bound variable):

$$F_u\theta = \mathrm{cons}(s_1, \mathrm{cons}(s_2, \ldots, \mathrm{cons}(s_k, z_{m+1}) \cdots)),$$

for some $k \ge 1$, by using that $R_u$ is ground and that $\mathrm{cons} \notin \Sigma_u$ (see
[?, Lemma 1]). Hence, since $\theta$ solves (6), it follows that

$$\mathrm{cons}(s_1[\bar{l}_u, t'], \ldots \mathrm{cons}(s_k[\bar{l}_u, t'], \mathrm{cons}(t\theta, \mathrm{nil})) \cdots) = \mathrm{cons}(s\theta, \mathrm{cons}(s_1[\bar{r}_u, \mathrm{nil}], \ldots \mathrm{cons}(s_k[\bar{r}_u, \mathrm{nil}], \mathrm{nil}) \cdots)),$$

where $s$ is the source term of (1), $t$ is the target term of (1), and $t' = \mathrm{cons}(t\theta, \mathrm{nil})$.
So there exists a reduction in $R_u \cup \{t' \to \mathrm{nil}\}$ of the following form:

$$s\theta = s_1[\bar{l}_u, t'] \to^* s_1[\bar{r}_u, \mathrm{nil}] = s_2[\bar{l}_u, t'] \to^* \cdots = s_k[\bar{l}_u, t'] \to^* s_k[\bar{r}_u, \mathrm{nil}] = t\theta$$

This means that $s\theta \to^*_{R_u \cup \{t' \to \mathrm{nil}\}} t\theta$, i.e.,

$$f_u(x_0, x_2, x_0, x_2, y_4, y_6, y_4, y_6, y_4, x_0, y_6, t_{id}^{(7)} \cdot x_2)\theta \;\to^*_{R_u \cup \{t' \to \mathrm{nil}\}}\; f_u(q_{mv}^{(0)}, q_{mv}^{(2)}, x_1, x_1, q_{id}^{(4)}, q_{id}^{(6)}, y_3, y_3, y_5, y_5, y_7, y_7)\theta.$$

It is sufficient to show that xqO, X 2 O, uaO, yeO G Because then s6 G and 
the rule t' — s- nil can not be used in the reduction of s9 to t9, since nil does not 
occur in ifu. To begin with, we observe that 



Xi9 



/-nil} 



'dmv 



(i = 0,2) and that yi9 



»nil| 



"9id (* = 4,6). 



It follows by easy induction on the length of reductions that $t' \to \mathrm{nil}$ cannot
be used in these reductions, since nil does not occur in $R_u$. Hence,
xq9, x 29, yi9, ye9 G , as needed. Kl 



We conclude with the following result, that follows from Lemma El Lemma E] 
and Lemma El 



Theorem 2. Second-order unification is undecidable with one second-order
variable that occurs at most twice.



The role of first-order variables in the above undecidability result is important. 
Without first-order variables, and if there is only one second-order variable that 
occurs at most twice, second-order unification reduces to ground reachability, 
m, and is thus decidable. 



5 Decidable Cases 

We show that rigid reachability and simultaneous rigid reachability are
decidable when the rules are all ground, either the source $s$ or the target $t$ of
any constraint is linear, and the source $s$ and the target $t$ are variable-disjoint,
that is, $Var(s) \cap Var(t) = \emptyset$. The non-simultaneous case then turns out to be
EXPTIME-complete. EXPTIME-hardness holds already with just a single
variable. This contrasts with the fact that rigid E-unification with one variable is
P-complete [Zj. When additionally both the source and target terms are linear,
then rigid reachability and simultaneous rigid reachability are both P-complete.

Note that the only difference between the conditions for undecidability of 
rigid reachability in Theoremnand the condition for decidability in Theorems^ 
El and Elis the linearity of source and (or) target terms. In the rest of the section, 
we assume fixed a signature E. 



5.1 Decidable Cases of Rigid Reachability 

We begin with defining a reduction from rigid reachability to the emptiness
problem of the intersection of $n$ regular languages recognized by tree automata
$A_1, \ldots, A_n$. This intersection emptiness problem is known to be
EXPTIME-complete, see and m- We may assume the state sets of the $A_1, \ldots, A_n$
to be disjoint and that each of these tree automata has only one final state. We
call these final states, respectively, $q_{A_1}, \ldots, q_{A_n}$. For stating the following lemma,
we extend the signature $\Sigma$ by a new symbol $f$ of arity $n$, and assume that $n > 1$.



Lemma 6. $T(A_1) \cap \ldots \cap T(A_n) \ne \emptyset$ iff the following constraint has a solution:
$(R_{A_1} \cup \ldots \cup R_{A_n},\ f(x, \ldots, x),\ f(q_{A_1}, \ldots, q_{A_n}))$

Proof. ($\Rightarrow$) is obvious. For ($\Leftarrow$) we use the fact that the new symbol $f$ does not
occur in any transition rule of the $A_1, \ldots, A_n$. Therefore, and since the state sets
are disjoint, any reduction $f(x, \ldots, x)\theta \to^*_{R_{A_1} \cup \ldots \cup R_{A_n}} f(q_{A_1}, \ldots, q_{A_n})$ (where $\theta$
is a solution) takes place in one of the arguments of $f(x, \ldots, x)\theta$. Moreover, if
the reduction is in the $i$-th subterm, it corresponds to the application of a rule in
$R_{A_i}$. (It is possible, though, to apply a start rule in $R_{A_j}$ within the $i$-th subterm,
with $i \ne j$. But any reduction of this form blocks in that the final state $q_{A_i}$
cannot be reached from the reduct.) The fact that $n > 1$ prohibits states of the
automata from appearing in $x\theta$. □
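The statement of Lemma 6 can be tried out on a toy instance. The following Python sketch checks whether a ground term substituted for $x$ solves the constraint, by running the union of the transition rules bottom-up and testing that every final state is reachable. The encoding of terms as nested tuples, the rule dictionaries, and the two example automata (towers of a unary symbol s over a constant a of even height, and of height divisible by three) are assumptions made for this illustration; they do not come from the paper.

```python
from itertools import product

def reachable_states(term, rules):
    """Return the set of states that a ground term can be rewritten to."""
    symbol, *args = term
    child_sets = [reachable_states(arg, rules) for arg in args]
    states = set()
    # Try every combination of states reached by the direct subterms.
    for combo in product(*child_sets):
        states |= rules.get((symbol, combo), set())
    return states

def union_rules(*rule_sets):
    """Union of transition rules; state sets are assumed to be disjoint."""
    merged = {}
    for rules in rule_sets:
        for key, targets in rules.items():
            merged.setdefault(key, set()).update(targets)
    return merged

def solves(term, automata):
    """True iff f(term, ..., term) reduces to f(q_A1, ..., q_An)."""
    rules = union_rules(*(r for r, _ in automata))
    states = reachable_states(term, rules)
    return all(final in states for _, final in automata)

# Hypothetical automata over the constant a and the unary symbol s.
EVEN = ({("a", ()): {"e"}, ("s", ("e",)): {"o"}, ("s", ("o",)): {"e"}}, "e")
MOD3 = ({("a", ()): {"r0"}, ("s", ("r0",)): {"r1"},
         ("s", ("r1",)): {"r2"}, ("s", ("r2",)): {"r0"}}, "r0")

def tower(n):
    """Build the term s(s(...s(a)...)) with n occurrences of s."""
    term = ("a",)
    for _ in range(n):
        term = ("s", term)
    return term
```

Under these assumptions, `solves(tower(6), [EVEN, MOD3])` holds while `solves(tower(2), [EVEN, MOD3])` does not, mirroring the fact that a solution for $x$ must lie in the intersection of the recognized languages.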



Theorem 3. Rigid reachability is EXPTIME-hard even when the rules and the
target are ground and the source contains only a single variable.

For obtaining also an EXPTIME upper bound for a somewhat less restrictive 
case of rigid reachability we will apply tree automata techniques. In particu- 
lar, we will exploit the following fact of preservation of recognizability under 
rewriting, which is a direct consequence of results in 0. 

Proposition 1 ( 123 ). Let $R$ be a ground rewrite system and $t$ a linear term.
The set $\{u \in T_\Sigma \mid u \to^*_R t\sigma,\ t\sigma \text{ ground}\}$ is recognizable by a tree automaton $A$ the
size of which is in $O(\|t\| \cdot \|R\|^2)$.

Proposition 2. The subset of $T_\Sigma$ of ground instances of a given linear term $s$
is recognizable by a tree automaton $A_s$ the size of which is linear in the size of $s$.



Theorem 4. Rigid reachability, when rules are ground, the target is linear and
the source and the target are variable-disjoint, can be decided in time $O(n^{3k+4})$,
where $n$ is the size of the constraint, and $k$ is the total number of occurrences of
non-linear variables in the source term.

Observe that the time upper bound becomes $O(n^4)$ when $s$ is linear, since $k = 0$
in this case.

Proof. Assume to be given a reachability constraint $(R, s, t)$ of the required form.
We first construct a tree automaton $A$ from $t$ and $R$ with the properties as
provided by Proposition 1, that is, recognizing the predecessors with respect to
$R$ of the ground instances of $t$. The size $\|A\|$ of $A$ is in $O(n^3)$.

If the source $s$ is linear, then there is a solution for $(R, s, t)$ iff $T(A) \cap T(A_s) \ne \emptyset$,
where $A_s$ is the tree automaton of Proposition 2. Since the intersection of
recognizable languages is recognizable by a tree automaton the size of which is
the product of the sizes of the original tree automata, the solvability of the given
constraint can be decided in time $O(\|s\| \cdot n^3) \subseteq O(n^4)$.

If the source $s$ is not linear, we reduce our rigid reachability problem to $|Q_A|^k$
problems of the above type. We assume wlog that $A$ has only one final state $q^f$.
Let $(s_i)$ be the finite sequence of terms which can be obtained from the source
$s$ by the following replacements: for every variable $x$ which occurs $j \ge 2$ times
in $s$, we choose a tuple $(q_1, \ldots, q_j)$ of states of $A$ such that¹ $\bigcap_{i \le j} T(A, q_i) \ne \emptyset$,
and we replace the $i$-th occurrence of $x$ with $q_i$, for $i \le j$, in $s$.

Then the two following statements are equivalent:

(i) the constraint $(R, s, t)$ has a solution.

(ii) one of the constraints $(R_A, s_i, q^f)$ has a solution.

(i) $\Rightarrow$ (ii): Assume that $\sigma$ is a solution of the constraint $(R, s, t)$. This means
in particular that $s\sigma \in T(A)$, i.e. $s\sigma \to^*_{R_A} q^f$. Let $\tau$ be the restriction of $\sigma$ to the set
of linear variables of $s$ and $\theta$ be its restriction to the set of non-linear variables
of $s$. We have $s\theta \to^*_{R_A} s_i$ by construction and $\tau$ is a solution of the constraint
$(R_A, s_i, q^f)$.

(ii) $\Rightarrow$ (i): Assume $s_i\tau \to^*_{R_A} q^f$ for some $i$ and some grounding substitution
$\tau$. To each non-linear variable $x$ of $s$, we associate a term $s_x \in \bigcap_{i \le j} T(A, q_i)$ (it
exists by construction) where $q_1, \ldots, q_j$ are the states occurring in $s_i$ at positions
corresponding to $x$ in $s$. This defines a substitution $\theta$ on the non-linear variables
of $s$ (by $x\theta = s_x$) such that $s\tau\theta \in T(A)$. Hence $s\tau\theta \to^*_R t\sigma$ for some grounding
substitution $\sigma$ which is only defined on the variables of $t$. Since $Var(s) \cap Var(t) = \emptyset$,
the domains of $\theta$, $\tau$ and $\sigma$ are pairwise disjoint and $\tau \cup \theta \cup \sigma$ is indeed a solution
to the constraint $(R, s, t)$.

Complexity: The number of possible $s_i$ is smaller than $|Q_A|^k$, i.e. it is in
$O(n^{3k})$. Rigid reachability for one constraint $(R_A, s_i, q^f)$ can be decided in time
$O(n^4)$, according to the first part of this proof. Altogether, this gives a decision
time in $O(n^{3k+4})$. □

By symmetry, rigid reachability is also decidable when rules are ground, the
source is linear and the source and the target are variable-disjoint, with the same
complexities as in Theorem 4 according to the (non)-linearity of the target.

As a consequence we obtain these two theorems: 

Theorem 5. Rigid reachability is EXPTIME-complete when rules are ground,
the source and the target are variable-disjoint, and either the source or the target
is linear.



¹ One can decide this property in time $\|A\|^k \in O(n^{3k})$.

Theorem 6. Rigid reachability is P-complete when the rules are ground, the
source and the target are variable-disjoint, one of the source or the target is
linear, and the number of occurrences of non-linear variables in the other is
bounded by some fixed constant $k$ independent from the problem.

Note that the linear case corresponds to $k = 0$.

Proof. For obtaining the lower bound, one may reduce the P-complete uniform
ground word problem (see jl isj ! to rigid reachability where rules, source and
target are ground. The upper bound has been proved in Theorem 4. □

5.2 Decidable Cases of Simultaneous Rigid Reachability 

We now generalize Theorem 6 to the simultaneous case of rigid reachability.

Theorem 7. Simultaneous rigid reachability is P-complete for systems of
pairwise variable-disjoint constraints with ground rules, and sources and targets that
are variable-disjoint and linear.

Proof. Apply Theorem 6 separately to each constraint of the system. □

Similarly, we can prove: 

Theorem 8. Simultaneous rigid reachability is EXPTIME-complete for systems
of pairwise variable-disjoint constraints with ground rules, and sources and
targets that are variable-disjoint and such that at least one of them is linear for
each constraint.

The problem remains in P (see Theorem 6) if there is a constant $k$ independent
from the problem and for each $s_i$ (resp. $t_i$) which is non-linear, the total number
of occurrences of non-linear variables in $s_i$ (resp. $t_i$) is smaller than $k$.

We can relax the conditions in the above Theorem 8 by allowing some
common variables between the $s_i$.

Theorem 9. Simultaneous rigid reachability is in EXPTIME when all the rules
of a system of constraints $((R_1, s_1, t_1), \ldots, (R_m, s_m, t_m))$ are ground, either
every $t_i$ is linear or every $s_i$ is linear, and for all $i, j \le m$, the terms $s_i$ and $t_j$ and
respectively the terms $t_i$ and $t_j$ (when $i \ne j$) are variable-disjoint.

Proof. We prove the case where every $t_i$ is linear; the other case follows by
symmetry. We reduce this problem to an exponential number of problems of the
type of Theorem 8.

We associate a TA $A_i$ to each pair $(t_i, R_i)$ which recognizes the language
$\{u \in T_\Sigma \mid u \to^*_{R_i} t_i\sigma,\ t_i\sigma \text{ ground}\}$ (see Proposition 1). The size of each $A_i$ is
in $O(\|t_i\| \cdot \|R_i\|^2)$. We assume wlog that the state sets of the $A_i$ are pairwise
disjoint and that the final state sets of the $A_i$ are singletons, namely $F_{A_i} = \{q_i^f\}$.
We construct for each $i \le m$ a sequence of terms $(s_{i,j})$ obtained by replacement
of variable occurrences in $s_i$ (regardless of linearity) by states of $A_i$. To each
$m$-tuple $(s_{1,j_1}, \ldots, s_{m,j_m})$ we associate a system which contains the constraints:




1. $(R_{A_1}, s_{1,j_1}, q_1^f), \ldots, (R_{A_m}, s_{m,j_m}, q_m^f)$,

2. for every variable $x$ which occurs $k$ times in $\{s_1, \ldots, s_m\}$, with $k \ge 2$,
$(R_{A_1} \uplus \ldots \uplus R_{A_m},\ f^k(x, \ldots, x),\ f^k(q_1, \ldots, q_k))$, where $f^k$ is a new function
symbol of arity $k$ and $q_1, \ldots, q_k$ are the states occurring in $s_{1,j_1}, \ldots, s_{m,j_m}$ at
the positions corresponding to $x$ in $s_1, \ldots, s_m$.

Then the system $((R_1, s_1, t_1), \ldots, (R_m, s_m, t_m))$ has a solution iff one of the above
systems has a solution. Each of these systems has a size which is polynomial
in the size of the original system and moreover, each fulfills the hypothesis of
Theorem 8 and can thus be decided in EXPTIME. Since the number of the above
systems is exponential (in the size of the initial problem), we have an EXPTIME
upper bound for the decision problem. □



6 Monadic Simultaneous Rigid Reachability 

Our second main decidability result generalizes the decidability proof of Monadic 
SREU for ground rules HZ|. Moreover, our proof gives an EXPTIME upper 
bound to monadic SREU for ground rules. Although, the lower bound is known 
to be PSPACE jTZj, no interesting upper bound has been known before for this 
problem. We shall use basic automata theory for obtaining our result. More 
recently, and using different techniques, monadic rigid reachability with ground 
rules was found to be decidable also in PSPACE El- The presentation in this 
section will be in terms of words rather than monadic terms. 

Recognizing Substitution Instances. We will first show that $n$-tuples of
substitution instances of monadic terms are recognizable. For this purpose we let
automata compute on $((\Sigma \cup \{\bot\})^n)^*$, where $\bot$ is a new symbol. The
representation of a pair of words of $\Sigma^*$ as a word in $((\Sigma \cup \{\bot\})^2)^*$ is given by the product
$\otimes$ defined as follows:

$a_1 \ldots a_n \otimes b_1 \ldots b_m = (a_1, b_1), \ldots, (a_n, b_n), (\bot, b_{n+1}), \ldots, (\bot, b_m)$ if $n < m$
$a_1 \ldots a_n \otimes b_1 \ldots b_m = (a_1, b_1), \ldots, (a_m, b_m), (a_{m+1}, \bot), \ldots, (a_n, \bot)$ if $n \ge m$

We extend this definition of $\otimes$ associatively to tuples of arbitrary length in the
obvious way.
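The product of words can be written down directly. The following Python sketch is one possible rendering of the definition above, assuming that words are strings and that the character `#` plays the role of the new padding symbol (both are choices made for this illustration only).

```python
from itertools import zip_longest

PAD = "#"  # assumed padding symbol; it must not occur in the alphabet

def conv(*words):
    """Padded product of any number of words, as a list of symbol tuples.

    Shorter words are filled up with PAD so that all tracks have the
    length of the longest word.
    """
    return list(zip_longest(*words, fillvalue=PAD))
```

For instance, `conv("0", "101")` yields the three pairs `("0", "1")`, `("#", "0")`, `("#", "1")`, and calling `conv` with more than two words gives the extension to longer tuples.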

Lemma 7. Let $L_1, \ldots, L_n$ be recognizable subsets of $\Sigma^*$. Then $L_1 \otimes \ldots \otimes L_n$
is recognizable (in $((\Sigma \cup \{\bot\})^n)^*$). The size of the product automaton can be
bounded by the product of the sizes of its factor automata.

When constructing an automaton for $\Sigma^* \otimes L$ from an automaton $A$ for $L$, the
size of the product automaton can be bounded by $c \cdot \|A\|$, for some (small)
constant $c$ if the alphabet $\Sigma$ is assumed to be fixed.

Theorem 10. Given $p$ monadic $\Sigma$-terms $s_i$, the set of tuples of their ground
instances $\{s_1\theta \otimes \ldots \otimes s_p\theta \mid \theta \text{ ground}\}$ is recognizable by an automaton with size
exponential in $\sum_{i=1}^{p} |s_i|$.

The proof of Theorem 10 will be based on three technical lemmas.

Lemma 8. Let $L$ be a language of tuples of the form $a_1 \otimes \ldots \otimes a_n$ that is
recognized by $A$. Then, for any permutation $\pi$, the language

$L_\pi = \{a_{\pi(1)} \otimes \ldots \otimes a_{\pi(n)} \mid a_1 \otimes \ldots \otimes a_n \in L\}$

can be recognized by an automaton of the same size as $A$.

Proof. We apply the permutation $\pi$ to the tuples of symbols appearing in the
rules of $R_A$. □



Lemma 9. Given $s, t \in \Sigma^*$, the set $\{su \otimes tu \mid u \in \Sigma^*\}$ is recognizable by an
automaton with size exponential in $|s| + |t|$.

Proof. The automaton reads words $s'u \otimes t'v$, checking that $s' = s$ and $t' = t$
and that $u = v$. For the latter test, the automaton has to memorize (in states)
the last $||t| - |s||$ symbols of $\Sigma$ read in $s'u$. This is the reason of the exponential
number of states. An example of the construction for $\Sigma = \{0, 1\}$, $s = 0$ and
$t = 101$, is given in Figure 2. □
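The memorizing behaviour described in this proof can be imitated by a left-to-right scan with a bounded buffer. The Python sketch below is only an illustration of that idea, not the automaton of the paper: it assumes words are strings, uses `#` as the padding symbol, and keeps the symbols of $u$ that were read on the shorter-prefix track but not yet matched on the other track in a queue whose length never exceeds $||t| - |s||$.

```python
from collections import deque
from itertools import zip_longest

PAD = "#"  # assumed padding symbol, not part of the alphabet

def conv(u, v):
    """Padded product of two words."""
    return list(zip_longest(u, v, fillvalue=PAD))

def accepts(s, t, word):
    """Decide whether word equals su (x) tu for some word u."""
    if len(s) > len(t):
        # Swap the tracks so that the second prefix is the longer one.
        return accepts(t, s, [(b, a) for (a, b) in word])
    if len(word) < len(t):
        return False
    pending = deque()   # symbols of u seen on track 1, awaited on track 2
    ended = False       # True once track 1 has reached its padding
    for i, (a, b) in enumerate(word):
        if b == PAD:    # track 2 is never the shorter one
            return False
        if a == PAD:
            if i < len(s):
                return False
            ended = True
        elif ended:
            return False            # a symbol after padding is ill-formed
        elif i < len(s):
            if a != s[i]:
                return False        # track 1 must start with s
        else:
            pending.append(a)       # this symbol belongs to u
        if i < len(t):
            if b != t[i]:
                return False        # track 2 must start with t
        elif not pending or pending.popleft() != b:
            return False            # track 2 must repeat u
    return not pending
```

With $s = 0$ and $t = 101$ as in the example of the proof, `accepts("0", "101", conv("001", "10101"))` succeeds (here $u = 01$), while replacing the second track by `10111` makes the check fail. A finite automaton would encode the queue contents in its states, which is where the exponential number of states comes from.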

The construction underlying the proof of Lemma 9 cannot be generalized to the
case of non-monadic signatures. However, one can generalize it (in the monadic
case) to the product of an arbitrary number $p$ of words.

Lemma 10. Given $p \ge 1$, the set $\{s_1u \otimes \ldots \otimes s_pu \mid u \in \Sigma^*\}$ is recognizable by
an automaton with size exponential in $\sum_{i=1}^{p} |s_i|$.

Proof. By induction on $p$. The base case $p = 1$ is trivial, and the case $p = 2$
is proved in Lemma 9. Assume that $p \ge 2$ and that we have an automaton $A$
with $T(A) = \{s_1u \otimes \ldots \otimes s_pu \mid u \in \Sigma^*\}$ and one more word $s_{p+1}$. Let $A''$ be an
automaton such that $T(A'') = \{s_pu \otimes s_{p+1}u \mid u \in \Sigma^*\}$. $A''$ may be obtained by
applying Lemma 9 again. Clearly,

$(T(A) \otimes \Sigma^*) \cap (\underbrace{\Sigma^* \otimes \ldots \otimes \Sigma^*}_{p-1} \otimes T(A'')) = \{s_1u \otimes \ldots \otimes s_{p+1}u \mid u \in \Sigma^*\}$.

According to Lemma Q this language is recognizable by an automaton A! , and, 
by LemmasQandH ||A'|| is of the order . Kl 

Now we are ready to prove Theorem 10.

Proof. The terms $s_i$ are either ground or have the form $s_i(x_i)$ with one
occurrence of a variable $x_i$ “at the end”. Let, for any variable $x$ occurring in any of
the terms, $s_{i_1}, \ldots, s_{i_n}$ be those terms among the $s_i$ which contain $x$. According
to Lemma 10 the language

$L_1^x = \{s_{i_1}\theta \otimes \ldots \otimes s_{i_n}\theta \mid \theta \text{ ground}\}$

is recognizable by an automaton of size exponential in $\sum_{j=1}^{n} |s_{i_j}|$. From
Lemma 7 we infer that $L_2^x = L_1^x \otimes \Sigma^* \otimes \ldots \otimes \Sigma^*$, with $p - n$ factors of $\Sigma^*$, is
recognizable by an automaton with size exponential in $\sum_{j=1}^{n} |s_{i_j}|$. Finally, $L_3^x =
(L_2^x)_\pi$, with $\pi$ a permutation which maps the first $n$ indices $j$ to $i_j$, that is, puts
the $s_{i_j}$ into their right place in the sequence $1 \ldots p$, is also recognizable by an
automaton of the same size, see Lemma 8. Moreover, it is not difficult to see
that

$L_g = \{t_1 \otimes \ldots \otimes t_p \mid t_i = s_i \text{ if } s_i \text{ ground, and } t_i \in \Sigma^* \text{ otherwise}\}$

is recognizable by an automaton with size polynomial in $\max |s_i|$. The desired
language arises as the intersection of the languages $L_3^x$ and $L_g$ so that
recognizability with the stated complexity bound follows. □

Figure 2: An example illustrating the proof of Lemma 9

For solving reachability constraints, we also need to recognize rewriting
relations.

Theorem 11 ([fi*]). Given a ground rewrite system $R$ on $\Sigma^*$, the set $\{u \otimes v \mid
u \to^*_R v\}$ is recognizable by an automaton the size of which is polynomial in the
size of $R$.

Theorem 12. Rigid reachability in monadic signatures is in EXPTIME when 
the rules are ground. 

Proof. Let $(R, s, t)$ be a constraint over the monadic signature $\Sigma$. We show that
the set of “solutions” $S = \{s\theta \otimes t\theta \mid s\theta \to^*_R t\theta\}$ is recognizable. $s$ and $t$ may
each contain at most one variable, which we denote by $x$ and $y$, respectively. These
two variables may or may not be identical. Applying Theorem 10 we may infer
that $\{s\theta \otimes t\theta \mid x\theta \text{ and } y\theta \text{ ground}\}$ is recognizable by an automaton $A$ with size
exponential in $|s| + |t|$. By Theorem 11 the set $\{u \otimes v \mid u, v \in \Sigma^*, u \to^*_R v\}$ is
recognizable by an automaton $A'$ with size polynomial in the size of $R$. Clearly
$S = T(A) \cap T(A')$, and emptiness is decidable in time linear in the size of
the corresponding intersection automaton (which is exponential in $|s| + |t|$ and
polynomial in the size of $R$). □

The extension to the simultaneous case of Theorem 12 generalizes and improves
a result of Id.

Theorem 13. Simultaneous rigid reachability in monadic signatures is decid- 
able in EXPTIME when the rules are ground. 

Proof. The construction is a generalization of the one for Theorem 12. Suppose
we are given the system of constraints $(R_i, s_i, t_i)$, $1 \le i \le n$. We first construct
an automaton $A_i$ for each $i \le n$ such that $T(A_i) = \{u \otimes v \mid u, v \in \Sigma^*, u \to^*_{R_i} v\}$.
Then $A = \bigotimes_{i=1}^{n} A_i$ (see Lemma 7) recognizes the language:

$T(A) = \{u_1 \otimes v_1 \otimes u_2 \otimes \ldots \otimes u_n \otimes v_n \mid \text{for all } i \le n,\ u_i, v_i \in \Sigma^*,\ u_i \to^*_{R_i} v_i\}$.

The size of $A$ is the product of the sizes of the $A_i$, hence of order $M^n$ where $M$
is the maximum of the sizes of the $A_i$. In Theorem 10 we have shown that the
language

$L' = \{s_1\theta \otimes t_1\theta \otimes \ldots \otimes s_n\theta \otimes t_n\theta \mid \theta \text{ ground}\}$

is recognizable by an automaton $A'$ of size exponential in $\sum_{i=1}^{n} (|s_i| + |t_i|)$. The
simultaneous reachability constraint is solvable if and only if the intersection $L' \cap
T(A)$ is non-empty. According to the respective sizes of the automata in the
above intersection, this gives an EXPTIME upper bound for deciding
simultaneous rigid reachability. □

7 Conclusion 

We have shown that absence of symmetry makes the solving of rigid reachability
constraints in general much harder. In the non-simultaneous case one jumps
from decidability to undecidability. In the case of ground rewrite rules, source
terms with just a single variable, and ground target terms, the complexity
increases from P-completeness to EXPTIME-completeness. The undecidability of
rigid reachability implies a new undecidability result for second-order unification
problems with just a single second-order variable that occurs twice. We have also
seen that automata-theoretic methods provide us with rather simple proofs of
upper bounds in the monadic case.
Abstract. Finite tables are commonly used in many hardware and
software applications. In most theorem provers, tables are typically
axiomatized using predicates over the table indices. For proving conjectures
expressed using such tables, provers often have to resort to brute force 
case analysis, usually based on indices of a table. Resulting proofs can 
be unnecessarily complicated and lengthy. They are often inefficient to 
generate as well as difficult to understand. Large tables are often man- 
ually abstracted using predicates, which is error-prone; furthermore, the 
correctness of abstractions must be ensured. An approach for modeling 
finite tables as a special data structure is proposed for use in Rewrite 
Rule Laboratory (RRL), a theorem prover for mechanizing equational 
reasoning and induction based on rewrite techniques. Dontcare entries 
in tables can be handled explicitly. This approach allows tables to be 
handled directly without having to resort to any abstraction mechanism. 
For efficiently processing large tables, the concepts of sparse and weakly
sparse tables are introduced based on how frequently particular values
appear as table entries. Sparsity in the tables is exploited in correctness
proofs by doing case analyses on the table entries rather than on the indices.
The generated cases are used to deduce constraints on the table indices. 
Additional domain information about table indices can then be used to 
further simplify constraints on indices and check them. The methodology 
is illustrated using a nontrivial correctness proof of the hardware SRT 
division circuit performed in RRL. 1536 cases originally needed in the 
correctness proof are reduced to 12 top level cases by using the proposed 
approach. Each individual top level case generated is much simpler, even 
though it may have additional subcases. The proposed approach is likely 
to provide similar gains for applications such as hardware circuits for 
square root and other arithmetic functions, in which much larger and 
multiple lookup tables, having structure similar to the sparse structure 
of the SRT table, are used. 



1 Introduction 

With declining memory costs and the demand for higher speed, lookup tables are
being widely used in high performance hardware applications. Recent hardware
implementations of complex arithmetic functions such as division and square root
are based on large finite lookup tables with preprogrammed
values [3f2| . As an example, the hardware implementation of the radix
4 SRT division algorithm, the source of the now notorious Intel Pentium bug,
uses a large finite lookup table for quotient digit selection. This paper is
motivated by our recent work in which the SRT division algorithm was automatically
verified using the theorem prover Rewrite Rule Laboratory (RRL) with several
different formulations of the quotient digit selection lookup table lSIlTifl .

Our experience in these verification efforts suggests that most theorem provers 
for mechanizing induction including our own theorem prover RRL do not provide 
adequate support for reasoning about finite tables. Tables used in applications 
typically have structure. Many lookup tables have only a few distinct entries. 
Many of the entries are dont-cares either because those portions of a table are 
not expected to be accessed, or the way such a table is being used, the behavior 
does not depend upon the entry value for those indices. Further, indices may 
be ordered; in many cases indices are subranges over numbers, finite bit vectors 
etc. A typical approach for specifying lookup tables in a prover is by a case
expression (conditional expression) over the table indices (using if-then-else like
expressions, for example in PVS and ACL2). Such descriptions are too general,
and do not bring forth the structure in finite lookup tables.

A prover would typically resort to brute force case analyses based on the 
table indices regardless of the table structure. This may get prohibitively expen- 
sive for large tables. A hardware implementation of the SRT division algorithm, 
for instance, uses a table with up to 800 entries m- Implementations of arith- 
metic functions such as the square root are based on much larger lookup tables 
mm and many of these implementations use multiple lookup tables fSCE|- 
In order for the verification to scale up to these applications, it is necessary that 
the underlying structure in tables be better exploited by theorem provers. 

The main goal of this paper is to propose an approach for facilitating mech- 
anized reasoning about finite lookup tables in theorem provers such as RRL. We 
propose modeling finite lookup tables as a special data type, much like numbers 
and booleans. Viewing a finite table as a special data type instead of a predi- 
cate or a finite subset of a Cartesian product results in a compact and direct 
axiomatization, without having to bother about index values for which the table 
is not defined. Sparse and weakly sparse tables are defined based on how fre- 
quently particular values appear as table entries. A mechanism for considering 
the dont-care entries is discussed. 

The above concepts are exploited to mechanize reasoning about properties
of algorithms using finite tables. It is shown that for sparse and weakly sparse
tables, case analyses performed on the table entries rather than the
table indices result in fewer top level cases in a proof attempt, thus producing
compact and elegant proofs. For each table entry, from the instance of a given
property, simpler constraints on the table indices can be deduced by
projection using quantifier elimination techniques such that testing them suffices to
establish the given property. Structure and properties of the table indices can be
further exploited to reduce the number of index values for which the constraints
over the indices need to be checked.

In jS] the correctness of the SRT division algorithm was established in RRL using a
predicate based formulation of a lookup table whereas PH describes a more direct
verification strategy using a function based specification of the lookup table.
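The difference between case analysis on indices and on entries can be seen on a small example. The following Python sketch groups the index tuples of a table by entry value, so that a property is examined once per distinct entry and the indices in each group only supply the constraints to be checked. The 4x4 table `dm` (the value i + 1 on the diagonal, 0 elsewhere) and the helper names are hypothetical and chosen only for illustration; this is not how RRL implements the method.

```python
from collections import defaultdict
from itertools import product

def cases_by_entry(table):
    """Group index tuples by the entry stored at them."""
    groups = defaultdict(list)
    for index, entry in sorted(table.items()):
        groups[entry].append(index)
    return dict(groups)

def holds_for_all(table, prop):
    """Check prop(entry, index) with one top level case per distinct entry."""
    return all(prop(entry, index)
               for entry, indices in cases_by_entry(table).items()
               for index in indices)

# Hypothetical sparse table: i + 1 on the diagonal, 0 everywhere else.
N03 = range(4)
dm = {(i, j): (i + 1 if i == j else 0) for i, j in product(N03, N03)}

# Example property: a non-zero entry can only occur on the diagonal.
def nonzero_only_on_diagonal(entry, index):
    return entry == 0 or index[0] == index[1]
```

In this toy table there are 16 index tuples but only 5 distinct entries, so `cases_by_entry(dm)` produces 5 top level cases, and the case for entry 0 collects all 12 off-diagonal indices.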

Directly considering tables used in circuit implementations instead of
manually abstracting them using predicates has a distinct advantage, and is likely
to be preferred by hardware designers. Firstly, abstractions can be error-prone.
Secondly, their correctness must be established separately. Thirdly, there can be
a significant abstraction gap between a concrete circuit implementation and its
description using abstractions, depending upon the nature and complexity of the
abstraction process. The main argument for abstracting tables is that large tables
can lead to an unmanageable number of cases with considerable redundancy.
The proposed approach addresses these concerns, first by avoiding abstractions
and considering tables directly as they are, and second by exploiting the
structure of tables, both their entries and their indices.

The power of the proposed approach is especially evident from the radix 4
SRT division algorithm that employs a large lookup table for quotient digit
selection with unspecified entries. The main invariant establishing the convergence
of this algorithm is proved by the proposed approach in section 5. The number
of cases in the proof is reduced from the 1536 required in UDI, based on traditional
modeling of tables with case analyses being performed on table indices, to 12 top
level cases using the proposed approach. The individual cases generated by the
proposed approach are also much simpler than those generated by case analyses
over table indices.

An integration of tables into the theorem prover PVS is described in m 
The aim there is to provide a special syntactic notation for describing different 
more general types of tables to aid user specification and to analyze tables for 
disjointness, coverage and consistency using PVS. Unlike our approach, table 
construct in P) translates into regular PVS constructs (cond,case) that perform 
case analyses on the table indices. The notion of dontcare in this paper is similar 
to blank entry described there. A specification of the SRT division quotient 
digit selection table is also given there. However, this description uses higher 
order features of PVS such as dependent types and no attempt is made there 
to exploit the structure of the table besides the dontcare entries in proving 
properties. Support for specifying and analyzing tables is provided by tools such 
as Tablewise p], SCR* jS! and consistency checker for RSML ^]. However, these 
tools are aimed at developing consistent and complete table specifications, and 
do not address the use of structure in tables while proving properties involving 
tables as done in this paper. 

2 Formalizing Finite Lookup Tables in RRL 

A finite n-dimensional table is a data structure indexed by an n-tuple; for every
legal n-tuple of index values, it has an entry value. Some of the table entries may
be unspecified or dont-cares. Figure 1 below has examples of two tables: a matrix
and a table implementing a finite function. Without the proposed extension,
there are at least two ways to define a finite table in RRL.

1. Define a table as a function with the Cartesian product of index types as its
domain and the entry type as its range. For each index tuple for which the table
has an entry, the function is defined to be that entry. Otherwise, the function
is partially specified. In that case, to use the cover-set induction method to
generate cases in proofs I2DI, it is necessary to relativize the original formula
using a formula specifying the subdomain on which the function is defined.

2. Define a table as a predicate that is a subset of the Cartesian product of the
index types and the entry type. Both the index values for which there is a table
entry and those for which there is no table entry would
have to be given. Further, the lack of a functional notation is likely to lead to
cumbersome expression of properties involving a table.
Fig. 1. Diagonal and GCD4 Tables



To avoid having to specify the index tuples to be excluded from a table 
specification, we define a table as a special data type with a functional notation 
for accessing table entries. Ideally, we would like a table to be input graphically to 
RRL as given in Figure 1. In the absence of that, we propose a simple mechanism 
using finite enumerated types for indices. 

A finite enumerated data type is a finite set of distinct values, typically
denoted by a finite set of distinct free constructor symbols, i.e., no two distinct
constructors are equal. Such a data type en can be specified by listing its
constructors as nullary constants of type en and declaring them to be free.
Since finite subranges of natural numbers are often used for indices, a finite
enumerated data type can also be specified as a subrange: enum en: [lo . . .
hi: nat] where lo and hi are natural numbers with lo <= hi. Subranges over
integers can also be used as shown below for the quotient digit selection table
for SRT division.

If the constructors of an enumerated type are given using numbers, then 
an implicit conversion from the values of the enumerated type to numbers is 
done so that the usual operations on numbers supported by RRL as a part of 
the quantifier-free theory of Presburger arithmetic can be used jOj. As will be 
evident below, for SRT division, such an implicit conversion is quite useful. In 
this sense, the constructor names used for enumerated types can be overloaded. 
Using enumerated types, a parameterized data type construct table is
introduced. Each instance of this construct gives a specific data type table. A
table instance is specified by the name of the table, followed by a list of index
types, each assumed to be an enumerated type, and a value type for table entries.
This is followed by a list of all table entries; no order among the table entries is
assumed.

A parameterized (generic) function lookup is associated with table to access
the entries of a specific table given the index values. We slightly abuse the
notation and write lookup(t, i1, j1) to mean the entry associated with the
index values i1, j1 in the table t. For convenience, we introduce syntactic
sugar for lookup(t, i1, j1) and write it as t(i1, j1). A table can then be
specified by enumerating its entries as: t(i1, j1) := v1, t(i2, j2) := v2,
... . For example, the diagonal matrix dm can be specified as follows. Numbers
0, 1, 2, 3 are free constructors of the enumerated type n03. The value type of the
table is natural numbers, but only the subrange 0-4 is being used.



[0, 1, 2, 3 : -> n03] 
table [dm : n03, n03 -> nat] 

2.1 Specifying Tables with Dontcare Entries 

Many lookup tables in practice have dontcare entries, i.e., for certain index 
values, it does not really matter what the table entry is. This may be so either 
because the table is not meant to be used for such index values, or because the 
properties of interest involving the table do not depend upon the entry value for 
such index values. A table with dontcare entries is specified in the same way as a 
table without dontcare entries, with the difference that a special constant value 
dontcare is used as an entry value. The constant dontcare is the only value of a 
built-in sort called Dtcare, and the value type of such a table is a union of Dtcare 
and er, the type of other table entries. 

For example, in Figure 1, the table gcd, denoting the greatest common divisor 
function modulo 4, is deliberately made, for illustrative purposes, to have 
a dontcare value (indicated by - in the table) when both of its arguments are 
0. The specification of this table is given below. 



table [gcd : n03, n03 -> nat U Dtcare] 
    ... := 3 
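
The table construct and its lookup function can be mimicked in a few lines of Python. This is only an illustrative sketch, not RRL syntax: the names `N03`, `DONTCARE`, `gcd_table` and `lookup` are hypothetical, the entries of dm are assumed to be dm(i, i) = i + 1 and 0 elsewhere (inferred from the case analysis in Section 3, since the full entry list is not reproduced here), and the gcd table is assumed to agree with the ordinary greatest common divisor except at index (0, 0).

```python
from itertools import product
from math import gcd as _gcd

N03 = (0, 1, 2, 3)        # enumerated index type n03
DONTCARE = "dontcare"     # stands in for the single value of sort Dtcare

# Assumed entries of the diagonal matrix: dm(i, i) = i + 1, 0 elsewhere.
dm = {(x, y): (x + 1 if x == y else 0) for x, y in product(N03, N03)}

# Assumed gcd table: ordinary gcd, with a dontcare entry at index (0, 0).
gcd_table = {
    (x, y): (DONTCARE if (x, y) == (0, 0) else _gcd(x, y))
    for x, y in product(N03, N03)
}

def lookup(table, *indices):
    """Return the entry associated with the given index values."""
    return table[tuple(indices)]
```

With these definitions, `lookup(dm, 1, 1)` plays the role of the sugared term dm(1, 1).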



3 Mechanizing Reasoning about Finite Tables 

We now illustrate using a simple example how properties about tables are proved 
in the theorem prover RRL in the absence of the proposed approach. 

Consider proving a simple property about dm. 

(C1): dm(x, y) * z <= (y + 1) * z, 

where x and y are variables of the enumerated type n03, z is a natural number, and 
* denotes multiplication over numbers. 

Without using the table data type, dm would have been partially specified 
as a binary function from natural numbers to natural numbers. So the above 
formula would not be meaningful without the conditions 0 <= x, y <= 3. The 
conjecture (C1) will have to be relativized as: 

(C1)': dm(x, y) * z <= (y + 1) * z if (0 <= x <= 3) and (0 <= y <= 3). 

The proof of (C1)' is attempted by exhaustive case analysis over the possible 
values of x and y (4 each), leading to 16 cases. Each of the 16 cases can be 
easily proved by simplification (invoking the decision procedure for quantifier-free 
Presburger arithmetic in RRL, for instance). For example, for x = 0, y = 1, 
we have: 

dm(0, 1) * z <= (1 + 1) * z if (0 <= 0 <= 3) and (0 <= 1 <= 3). 

Using the definitions of dm and *, the formula simplifies to 0 <= 2 * z, which is true. 

The reader would notice that the case generated for the table entry dm(0, 2) 
is similar. As a matter of fact, all the cases generated corresponding to the table 
entry 0 would be established in the same way, irrespective of the values of the 
indices x and y. 

We introduce the notion of a sparse table in the next section following the 
terminology from matrix algebra. We show how sparsity in the tables (e.g. dm) 
can be exploited to identify common structure among different cases. 



3.1 Sparse Tables 

Definition 1 A table t is sparse iff there is at least one table entry t(i_1, ..., i_n) 
such that the number of index tuples with that entry is at least |t|/2, where |t| 
is the size of t. 

In a sparse table, thus, the majority of the entries have the same value. This 
entry value is called the most frequent entry value. The table dm is sparse with 
the most frequent entry value being 0. Similarly, the table gcd is also sparse with 
the most frequent entry value being 1. 
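
A sparsity check along these lines can be sketched in Python. The function names are hypothetical, the threshold follows the reading "at least |t|/2" of Definition 1, and the dm entries are the same assumed values as before (dm(i, i) = i + 1, 0 elsewhere).

```python
from collections import Counter
from itertools import product

def most_frequent_entry(table):
    """Return (entry value, number of index tuples carrying it)."""
    return Counter(table.values()).most_common(1)[0]

def is_sparse(table):
    """True if one entry value covers at least half of the index tuples."""
    _, count = most_frequent_entry(table)
    return 2 * count >= len(table)

# Assumed diagonal matrix: dm(i, i) = i + 1, 0 elsewhere.
dm = {(x, y): (x + 1 if x == y else 0) for x, y in product(range(4), range(4))}

# A small table with four distinct entries, which is not sparse.
dense = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 3}
```

For dm the most frequent entry is 0, carried by 12 of the 16 index tuples.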



Mechanizing Reasoning about Sparse Tables In many applications, while 
reasoning about sparse tables, the cases generated corresponding to the most 
frequent entry value are often proved in a similar fashion. The proofs usually 
follow due to the special properties of the most frequent entry value regardless 
of the corresponding values of the indices. This information can be exploited 
while attempting proofs. 

From the conjecture and the most frequent entry value, properties that indices 
must satisfy can be generated. Such properties are often easier to prove 
than the original conjecture. As a result, it is not only easy to mechanically prove 
such a conjecture by case analysis on table entries instead of case analysis on 
indices, but even a hand-proof is likely to be simplified using this approach. 

Consider the above conjecture (C1) about the diagonal matrix dm. 

(C1): dm(x, y) * z <= (y + 1) * z. 

Let us attempt a proof based on the table entries in dm instead of the index values 
x, y. There are 5 cases, instead of the 16 cases due to different index values x and y. 
First, we attempt a proof for index tuples for which the table entry is the most 
frequent entry value 0. (C1) reduces to (C1.0): 0 * z <= (y + 1) * z, with the 
implicit assumption that only those values of x and y for which dm(x, y) = 0 
must be considered. This formula simplifies to (C1.0): 0 <= (y + 1) * z, which 
is true for natural numbers. Notice that different values of x and y for which the 
table entry is 0 need not be considered. 

Now consider the remaining entry values. For dm(x, y) = 1, (C1) reduces to 
(C1.1): 1 * z <= (y + 1) * z, which simplifies to (C1.1): z <= (y + 1) * z, 
which reduces to true no matter what y is. 

For dm(x, y) = 2, (C1) reduces to (C1.2): 2 * z <= (y + 1) * z, which 
simplifies to (C1.2): z <= y * z. This formula is not true unless y >= 1. Since 
dm(x, y) = 2 implies that x = 1 and y = 1, this case also follows. 

The other two cases also follow as they constrain y to be 2 and 3. 

As this simple example illustrates, case analysis based on indices would have 
resulted in 16 cases for the above formula, whereas case analysis based on table 
entry values results in only 5 cases. 12 cases corresponding to the entry value 0 
are handled as one single case, thus recognizing a common proof structure. 
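
The difference between the two styles of case analysis can be made concrete with a small Python sketch. It groups index tuples by entry value (the map that Section 5 calls a table cover set) and checks (C1) per entry. The dm entries are assumed as before, the helper names are hypothetical, and sampling z over a finite range is a test, not a proof.

```python
from collections import defaultdict
from itertools import product

# Assumed diagonal matrix: dm(i, i) = i + 1, 0 elsewhere.
dm = {(x, y): (x + 1 if x == y else 0) for x, y in product(range(4), range(4))}

def cover_set(table):
    """Map each entry value to the list of index tuples that carry it."""
    cover = defaultdict(list)
    for index, entry in table.items():
        cover[entry].append(index)
    return dict(cover)

def c1(entry, y, z):
    """(C1) with the table term dm(x, y) replaced by its entry value."""
    return entry * z <= (y + 1) * z

cover = cover_set(dm)
Z_SAMPLES = range(50)  # finite sample standing in for all natural numbers

entry_cases_ok = all(
    c1(entry, y, z)
    for entry, indices in cover.items()
    for (_, y) in indices
    for z in Z_SAMPLES
)
```

Here `len(dm)` is the number of index-based cases and `len(cover)` the number of entry-based cases.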

3.2 Deducing Constraints over Indices for Table Entries 

As suggested above, case analysis on table entry values often leads to a simpler 
version of the conjecture. For each table entry v, a proof of the simplified con- 
jecture can be attempted exhaustively, by substituting only those index values 
with the table entry v. Redundancy and duplication in proofs for different cases 
can be avoided this way. 

Another promising approach is to generate from the simplified conjecture, a 
constraint on index values that must be satisfied for the conjecture to be valid 
for a particular table entry v. It can then be checked whether the index values 
corresponding to v indeed satisfy the constraint. 

In the above example, for dm(x, y) = 2, the formula simplifies to (forall z) z <= y * z. 
One possibility is to exhaustively check the conjecture for values of y for which 
dm(x, y) = 2. In case there are many values of x, y for which the table entry is 2, 
another way is to eliminate z from the above formula by quantifier elimination, 
which gives the constraint 1 <= y on y, which is indeed true. 

The main idea in deriving constraints on index variables from a given con- 
jecture for a particular table entry value is that of projection of the values of 
index variables. This can be obtained by eliminating non-index variables from 
the negation of the simplified formula by quantifier elimination. This is described 
below. 

Consider a universally quantified conjecture $\phi(x_1, \ldots, x_n, y_1, \ldots, y_m)$, where 
$x_1, \ldots, x_n$ are the index variables and $y_1, \ldots, y_m$ are the nonindex variables, 
and $\phi$ has occurrences of table terms indexed by $x_1, \ldots, x_n$. Without any loss of 
generality and for simplicity, we assume a single table term 
$t(x_1, \ldots, x_n)$. Consider a particular entry value, say $v$, of $t(x_1, \ldots, x_n)$; let $I$ be 
the finite set of index tuples (values of $x_1, \ldots, x_n$) for which the table $t$ has entry 
value $v$. 

When $\phi$ is attempted for the case $t(x_1, \ldots, x_n) = v$, a formula $\phi_1$ with 
$t(x_1, \ldots, x_n)$ replaced by $v$ is obtained from $\phi$. Typically $\phi_1$ is simpler than $\phi$. If it can 
be proved, we are done. In general, $\phi_1$ need not be valid, in which case the goal is 
to find an equivalent quantifier-free formula $\psi(x_1, \ldots, x_n)$ without any nonindex 
variables such that for each index tuple satisfying $\psi$, $\phi_1$ is true for every value 
of the nonindex variables. This can be achieved by quantifier-elimination 
methods. 

Let $\theta$ be $\exists y_1, \ldots, y_m\; \neg\phi_1(x_1, \ldots, x_n, y_1, \ldots, y_m)$. The formula 
$\theta$ characterizes the index tuples for which there is at least one tuple of values for 
nonindex variables that falsifies $\phi_1$. The corresponding index tuples constitute 
the complement of $I$. Let $\psi_1(x_1, \ldots, x_n)$ be a quantifier-free formula equivalent 
to $\exists y_1, \ldots, y_m\; \neg\phi_1(x_1, \ldots, x_n, y_1, \ldots, y_m)$. Then $\psi = \neg\psi_1$ characterizes the set 
of index tuples such that for all values of nonindex variables, $\phi_1$ is true. The 
formula $\psi(x_1, \ldots, x_n)$ is the constraint on index variables for $\phi$ to be valid if 
$t(x_1, \ldots, x_n) = v$. If for every tuple in $I$, $\psi$ is true, then $\phi$ is valid for the case 
when $t(x_1, \ldots, x_n) = v$. 

The above discussion gives an algorithm as well as sketches a correctness 
proof of the proposed algorithm. For eliminating nonindex variables, different 
techniques can be used. If the nonindex variables range over numbers, then 
Fourier’s elimination algorithm in RRL can be used. 

The above algorithm is illustrated on (C1): 

(C1): dm(x, y) * z <= (y + 1) * z. 

For dm(x, y) = 3, (C1) simplifies to (C1.3): 3 * z <= (y + 1) * z, which reduces 
to (C1.3): 2 * z <= y * z. It must be shown that (C1.3) is true for every z 
on values of y for which dm(x, y) = 3. The variable z is the nonindex variable. 
Negating (C1.3) gives: (exists z) 2 * z > y * z. Using z = 1, the quantifier can 
be eliminated, giving 2 > y. The negation of this formula, 2 <= y, is the 
constraint on y, which is indeed the case for y such that dm(x, y) = 3. 
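
The shape of this algorithm can be imitated in Python by a brute-force stand-in for quantifier elimination: instead of eliminating z symbolically, the sketch below evaluates the instantiated conjecture over a bounded range of z. This is not Fourier's method and not RRL's implementation. The names `phi1`, `psi`, `case_holds` and the bound `Z_DOMAIN` are hypothetical, and the dm entries are assumed as before.

```python
from itertools import product

N03 = range(4)
# Assumed diagonal matrix: dm(i, i) = i + 1, 0 elsewhere.
dm = {(x, y): (x + 1 if x == y else 0) for x, y in product(N03, N03)}

Z_DOMAIN = range(20)  # bounded stand-in for the natural numbers

def phi1(v, y, z):
    """(C1) with dm(x, y) replaced by the entry value v."""
    return v * z <= (y + 1) * z

def psi(v, y):
    """Index constraint: phi1 holds for every sampled nonindex value z."""
    return all(phi1(v, y, z) for z in Z_DOMAIN)

def case_holds(v):
    """Check psi on the set I of index tuples whose entry value is v."""
    index_tuples = [(x, y) for (x, y), entry in dm.items() if entry == v]
    return all(psi(v, y) for (_, y) in index_tuples)
```

For the entry value 3, `psi(3, y)` holds exactly for y = 2 and y = 3, which matches the constraint 2 <= y derived above.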

3.3 Tables with Dontcare Entries 

A dontcare entry in a table either denotes portions of the table that are not 
supposed to be accessed or a dontcare entry may be used like a wild-card in 
which case the validity of a conjecture involving the table does not depend 
on the particular entry value. In both these situations, for a dontcare entry 
value, it must be ensured that for the index values for which the table entry is dontcare, 
the conjecture is valid independent of the table entry. This particular case is 
handled separately without replacing the table term by the dontcare value. This 
is illustrated using the property (C2) about the gcd function. 

(C2): odd(gcd(x, y)) if odd(x) or odd(y). 

The case analysis is done based on different values of gcd(x, y). The case corresponding 
to the dontcare value in the gcd table is when x = 0, y = 0. For that 
case, (C2) simplifies to: 

(C2.d): odd(gcd(0, 0)) if odd(0) or odd(0), 

which is true as odd(0) reduces to false by the definition of odd. 

The other cases are proved by case analyses on entries as done before. Three 
cases are generated, corresponding to the entry values 1, 2 and 3 respectively. 

The reader should compare the simplicity, compactness and elegance of the 
above proof of (C2) with a proof based on exhaustive case analysis on x and y. 
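
A small Python sketch of this treatment of dontcare entries follows. It assumes the gcd table agrees with the ordinary greatest common divisor except at (0, 0); the names `c2_holds` and `DONTCARE` are hypothetical. For the dontcare index the conjecture must hold without inspecting the entry, which here means the hypothesis must be false.

```python
from itertools import product
from math import gcd

N03 = range(4)
DONTCARE = "dontcare"
# Assumed gcd table: ordinary gcd, with a dontcare entry at index (0, 0).
gcd_table = {
    (x, y): (DONTCARE if (x, y) == (0, 0) else gcd(x, y))
    for x, y in product(N03, N03)
}

def odd(n):
    return n % 2 == 1

def c2_holds(x, y):
    """(C2): odd(gcd(x, y)) if odd(x) or odd(y)."""
    if not (odd(x) or odd(y)):
        return True  # hypothesis false: holds whatever the entry is
    entry = gcd_table[(x, y)]
    # A dontcare entry cannot be used to establish the conclusion.
    return entry != DONTCARE and odd(entry)

entry_values = sorted(v for v in set(gcd_table.values()) if v != DONTCARE)
```

The list `entry_values` contains the three non-dontcare cases mentioned in the text.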

3.4 Weakly Sparse Tables 

Lookup tables used in applications such as the SRT division and square root 
m have a different structure than sparse tables. There is no one single entry 
that is occurring most frequently in a table. Instead, the table entry is either the 
dontcare value or from a small subset of values. For example, the large quotient 
digit table in the SRT division algorithm has only 6 distinct values corresponding 
to the quotient digit that can arise in any iteration of the division algorithm: 
{-2, -1, 0, 1, 2} and a dontcare value. The notion of a weakly sparse table is 
introduced to characterize such tables. 

Definition 2 A table t, table [t : e1, e2 -> er], is weakly sparse iff |entries(t)| 
<= minimum(|e1|, |e2|), where entries(t) is the set of all table entries including 
the dontcare value, if used. 

The rationale behind this definition is that performing case analysis on entry 
values for a weakly sparse table does not result in more cases than would arise 
if the case analysis is done on any of the index variables. 
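
Definition 2 translates directly into a check on the number of distinct entries. The sketch below uses a hypothetical toy table (not the SRT table) whose entries are row indices clamped to [-2, 2]; it reads the comparison in the definition as "at most", in line with the rationale just given.

```python
from itertools import product

def is_weakly_sparse(table, index_type_1, index_type_2):
    """True if the table has no more distinct entries than either index type has values."""
    entries = set(table.values())
    return len(entries) <= min(len(index_type_1), len(index_type_2))

# Hypothetical 8 x 8 toy table: the entry is the row index clamped to [-2, 2].
ROWS = range(-4, 4)
COLS = range(8, 16)
toy = {(r, c): max(-2, min(2, r)) for r, c in product(ROWS, COLS)}

# A 2 x 2 table with four distinct entries is not weakly sparse.
dense = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 3}
```

The toy table has 5 distinct entries against 8 values per index, so entry-based case analysis yields 5 cases rather than 8 or 64.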

Even if a table is neither sparse nor weakly sparse, mechanizing proofs of 
properties involving tables based on entries may lead to fewer cases with simpler 
proofs. If the number of distinct table entries is much less than the total number 
of distinct index tuples, proof attempts using case analysis based on table entries 
can be helpful. Sometimes, it is also possible to use properties of index types 
as well for testing constraints on indices deduced from an instantiation of a 
given conjecture based on a particular table entry value. This is illustrated in 
the verification of the main invariant of the SRT division algorithm in the next 
section. 

In the next section, we illustrate the proposed approach using the SRT division 
circuit. We first describe in detail the algorithm realized by the circuit, with 
a focus on the P-D plot and quotient digit selection table. This is followed by a 
brief review of the circuit. The specification of SRT division in RRL is partially 
reviewed, focusing on the quotient digit selection table. Finally, a correctness 
proof of the main invariant of the circuit is discussed based on the proposed 
approach. 

4 SRT Division 

The SRT division algorithm |iyil2lllT| is an iterative algorithm for dividing a 
normalized dividend by a normalized divisor in which the quotient is computed 
digit by digit by repeatedly subtracting the multiples of the divisor from the 
dividend. The algorithm can be formalized in terms of the following recurrences 
about division in base (radix) r. 

P_0 := dividend / r,   Q_0 := 0, 

P_{j+1} := r * P_j - q_{j+1} * D,   for j = 0, ..., n-1, 

Q_{j+1} := r * Q_j + q_{j+1},   for j = 0, ..., n-1, 

where D is the positive divisor, P_j is the partial remainder at the beginning of 
the j-th iteration, and 0 <= P_j < D, for all j, Q_j is the quotient at the beginning 
of the iteration j, q_j is the quotient digit at iteration j, n is the number of digits 
in the quotient, and r is the radix used for representing numbers. The bounds on 
the successive partial remainders 0 <= P_j < D guarantee the convergence of the 
algorithm. 
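
The recurrences can be exercised with exact rational arithmetic. The sketch below is not the SRT algorithm: it picks each digit as floor(r * P_j / D), a non-redundant choice made here only to keep 0 <= P_j < D, whereas SRT selects digits from a lookup table. The function name is hypothetical, and the input is assumed to satisfy 0 <= dividend < r * D.

```python
from fractions import Fraction

def digit_recurrence_divide(dividend, divisor, n, r=4):
    """Run n iterations of P_{j+1} = r*P_j - q*D and Q_{j+1} = r*Q_j + q."""
    D = Fraction(divisor)
    P = Fraction(dividend) / r   # P_0 := dividend / r
    Q = 0                        # Q_0 := 0
    for _ in range(n):
        q = (r * P) // D         # illustrative digit choice, not table based
        P = r * P - q * D
        Q = r * Q + q
    return Q, P
```

After n iterations the pair (Q, P) satisfies Q * D + P = dividend * r^(n-1), which follows from the recurrences by induction on j.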

SRT dividers used in practice incorporate several performance enhancing 
techniques while realizing the above recurrences. In particular, it is necessary to 
minimize the number of iterations and efficiently i) compute the quotient digit 
in each iteration, ii) multiply divisor by the quotient digit, as well as partial 
remainders and quotients by radix, and iii) perform subtraction/addition. 

The SRT division algorithm in this paper uses the radix 4, and the quotient 
digits are represented by a redundant signed-digit representation with digits in 
the range [—2, 2]. Tradeoffs between speed, radix choice and redundancy of quo- 
tient digits are discussed in unj. Because of the redundancy, the bounds on the 
successive partial remainders for the convergence of the algorithm can be looser: 

-D * 2/3 <= P_j <= D * 2/3. 

By substituting the recurrence for the successive partial remainders, the range 
of shifted partial remainders, 4 * P_j, that allow a quotient digit k to be chosen 
is: 

[(k - 2/3) * D, (k + 2/3) * D]. 
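
The interval above determines which digits are admissible for a given shifted partial remainder. A minimal Python sketch, with a hypothetical function name and exact fractions, makes the overlap between adjacent digits visible:

```python
from fractions import Fraction

DIGITS = (-2, -1, 0, 1, 2)   # redundant signed-digit set for radix 4
BOUND = Fraction(2, 3)

def feasible_digits(shifted_p, d):
    """Digits k with (k - 2/3) * D <= 4 * P_j <= (k + 2/3) * D."""
    return [k for k in DIGITS
            if (k - BOUND) * d <= shifted_p <= (k + BOUND) * d]
```

For example, with D = 1 a shifted partial remainder of 3/2 admits both digits 1 and 2, while a value of 2 admits only the digit 2.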

The above relation between the shifted partial remainder range P and divisor D 
is diagrammatically plotted as a P-D plot given in Figure 2. The plot 
gives the shifted partial remainder ranges in which a quotient digit can be selected 
without violating the bounds on the next partial remainder. For example, 
when the partial remainder is in the range [5/3 D, 8/3 D], the quotient digit 2 
is selected. The shaded regions represent quotient digit overlaps, where more 
than one quotient digit selection is feasible. So if the partial remainder is in the 
range [4/3 D, 5/3 D], either 2 or 1 can be used. Due to the overlap between the 
lower bound for the P/D ratio for quotient digit k and the upper bound for the 
quotient digit k - 1, the P/D ratio can be approximated in choosing quotient digits. 

Fig. 2. P-D Plot for Radix 4 (vertical axis: shifted partial remainder) 
Redundancy in quotient digits allows the quotient digit to be selected based 
on only a few significant bits of the partial remainder and the divisor. As ex- 
plained in for a radix 4 SRT divider with the partial remainders and divisor 
of arbitrary width n, n > 8, it suffices to consider partial remainders up to 7 bits 
of accuracy and a divisor up to 4 bits of accuracy. This reduces the complexity 
of the quotient selection process, as it can be implemented as a finite table, and 
the partial remainder computation can be overlapped with the quotient digit 
selection computation. 

The quotient digit selection table implementing the P-D plot for radix 4 is 
reproduced above from H31. Rows are indexed by the shifted truncated partial 
remainder g7g6g5g4.g3g2g1 (represented in 2's complement); columns are indexed 
by the truncated divisor f1.f2f3f4; table entries are the quotient digits. 
The table is compressed by considering only row indices up to 5 bits since only 
a few entries in the table depend upon the 2 least significant bits g2g1 of the 
shifted partial remainder. For those cases, the table entries are the symbolic values 
A, B, C, D, E, defined following Table 1. 

parrem            Divisor f1.f2f3f4 
g7g6g5g4.g3g2g1   1.000  1.001  1.010  1.011  1.100  1.101  1.110  1.111 
1010.0 
1010.1              -      -      -      -      -      -     -2     -2 
1011.0              -      -      -      -      -     -2     -2     -2 
1011.1              -      -      -     -2     -2     -2     -2     -2 
1100.0              -      -     -2     -2     -2     -2     -2     -2 
1100.1             -2     -2     -2     -2     -2     -2     -2     -2 
1101.0             -2     -2     -2     -2     -2     -2      B     -1 
1101.1             -2     -2     -2      B     -1     -1     -1     -1 
1110.0              A      B     -1     -1     -1     -1     -1     -1 
1110.1             -1     -1     -1     -1     -1     -1     -1     -1 
1111.0             -1     -1      D      D      0      0      0      0 
1111.1              0      0      0      0      0      0      0      0 
0000.0              0      0      0      0      0      0      0      0 
0000.1              1      1      1      1      E      0      0      0 
0001.0              1      1      1      1      1      1      1      1 
0001.1              2      C      1      1      1      1      1      1 
0010.0              2      2      2      2      C      1      1      1 
0010.1              2      2      2      2      2      2      2      1 
0011.0              -      2      2      2      2      2      2      2 
0011.1              -      -      2      2      2      2      2      2 
0100.0              -      -      -      -      2      2      2      2 
0100.1              -      -      -      -      -      2      2      2 
0101.0                                                               2 
0101.1 

Table 1. Quotient Digit Selection Table 



A = -(2 - g2*g1), B = -(2 - g2), C = 1 + g2, D = -1 + g2, E = g2. 

Every entry in the table is thus for four remainder estimates. The - entries in 
the table are the dontcare entries. 

In the above recurrence relations, q_{j+1} is replaced by qtable(up, ud), where 
qtable is the quotient selection table, and up, ud are, respectively, the truncated 
partial remainder and divisor. 



4.1 SRT Divider Circuit 

A radix 4 SRT divider circuit based on the above quotient digit selection table 
is described in Figure 3. The registers divisor, remainder in the circuit hold the 
value of the divisor and the successive partial remainders respectively. The reg- 
ister q holds the selected quotient digit along with its sign; the registers QPOS 
and QNEG hold the positive and negative quotient digits of the quotient. A 
multiplexor MUX is used to generate the correct multiple of the divisor based 
on the selected quotient digit by appropriately shifting the divisor. The hard- 
ware component QUO LOGIC stands for the quotient selection table, and it 
is typically implemented using an array of preprogrammed read-only-memory. 
The hardware component DALU is a full width ALU that computes the partial 
remainder at each iteration. The component GALU (the guess ALU ^2]) is an 
8-bit ALU that computes the approximate 8-bit partial remainder to be used 
for quotient selection. The components << 2 perform a left shift by 2 bits, i.e., multiplication by 4. 

The circuit is initialized by loading dividend/4 (by right shifting the dividend 
by 2 bits) and the divisor into the remainder and divisor registers. The quotient 
is initialized to be zero by setting the registers QPOS and QNEG to be zero. 
The quotient digit register q is initialized by the appropriate alignment of the 
dividend and the divisor. At each iteration, the correct multiple of the quotient 
digit and the divisor is output by MUX. This output and the partial remainder in 
the remainder register are input to DALU to compute the next partial remainder. 
An 8 bit estimate of the partial remainder in the remainder register and an 8 
bit estimate of the output of the MUX are input to the GALU. 

GALU computes an 8 bit estimate of the next partial remainder which is left 
shifted by 2 bits (multiplied by 4), and then used with the truncated divisor (dl) to index into QUO 
LOGIC to select the quotient digit for the next iteration. Note that GALU and 
the quotient digit selection are done in parallel with the full width DALU so 
that the correct quotient digit value is already available in the register q at the 
beginning of each iteration. 




Fig. 3. SRT Division Circuit using Radix 4 



4.2 Specifying Quotient Digit Selection Table in RRL 

The quotient digit selection table is specified in RRL by qtable as an instance 
of the parameterized table type. The table indices are given by the integer sub- 
ranges column and row. The entry type of qtable is the union of integers and 
Dtcare, but only the subrange [m(2) ... 2] is used. The unary function m is the 
minus operation on integers.^1 As the table is too big to be included here, we 
give a partial specification of one of the rows, the eighth row. 

The eighth row corresponds to four shifted truncated remainder estimates: 
{-17/8, -9/4, -19/8, -5/2}, depending upon the values of g2g1; they are scaled 
up by multiplying by 8, to {-17, -18, -19, -20} (2's complement is used in Table 
1 for row indices). Below, the table entries for row index -20 are given. 



[8 ... 15 : column]    [m(48) ... 47 : row] 
table [qtable : row, column -> integer U Dtcare] 
qtable(m(20), 8) := m(2),     qtable(m(20), 9) := m(2), 
qtable(m(20), 10) := m(2),    qtable(m(20), 11) := m(2), 
qtable(m(20), 12) := m(1),    qtable(m(20), 13) := m(1), 
qtable(m(20), 14) := m(1),    qtable(m(20), 15) := m(1). 



The table entry for the eighth row and the column index 1.011 (11) is B, where 
B = -(2 - g2). For all other column indices, the entries do not depend upon g2g1. 
So for all column indices other than 11, the table value is the same irrespective 
of whether the row index is -20, -19, -18 or -17. 

For the column index 11, the table entries are however different: it is -2 if 
the row index is -20 or -19, since in that case g2 is 0; if the row index is -18 
or -17, then the table entry is -1. 



qtable(m(19), 11) := m(2),    qtable(m(18), 11) := m(1),    qtable(m(17), 11) := m(1) 



Other rows are similarly specified, with each row defining 32 table entries. 
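
The expansion of a compressed row of Table 1 into its 32 scaled entries can be sketched in Python. The dictionary `SYMBOLIC` encodes the definitions of A to E given after Table 1; the names `expand_row`, `ROW_EIGHT` and `qtable_row8` are hypothetical and are not part of the RRL specification.

```python
# Symbolic entries as functions of the two low-order row bits g2, g1.
SYMBOLIC = {
    "A": lambda g2, g1: -(2 - g2 * g1),
    "B": lambda g2, g1: -(2 - g2),
    "C": lambda g2, g1: 1 + g2,
    "D": lambda g2, g1: -1 + g2,
    "E": lambda g2, g1: g2,
}

COLUMNS = range(8, 16)                          # scaled divisor indices 1.000 .. 1.111
ROW_EIGHT = [-2, -2, -2, "B", -1, -1, -1, -1]   # row 1101.1 of Table 1

def expand_row(base_up, row):
    """Expand one compressed row into entries for its four scaled row indices."""
    table = {}
    for low_bits in range(4):                   # g2g1 = 00, 01, 10, 11
        g2, g1 = low_bits >> 1, low_bits & 1
        for ud, cell in zip(COLUMNS, row):
            value = SYMBOLIC[cell](g2, g1) if cell in SYMBOLIC else cell
            table[(base_up + low_bits, ud)] = value
    return table

qtable_row8 = expand_row(-20, ROW_EIGHT)
```

The resulting dictionary agrees with the entries listed above for column index 11.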



4.3 Verifying Boundedness of Partial Remainders 

The main invariant of SRT division is specified in RRL using qtable as: 

(C3): m(2) * divsr <= 
        12 * parrem - 3 * qtable(up, ud) * divsr <= 2 * divsr if 
      m(2) * divsr <= 3 * parrem <= 2 * divsr and 
      ud <= 8 * divsr < ud + 1 and up <= 32 * parrem < up + 1. 

The above formula states that if the partial remainder parrem in the previ- 
ous iteration is within bounds of the divisor divsr (the absolute value of the 
partial remainder is within two-thirds of the divisor) and if the table indices up, 
ud correctly approximate the divisor and the partial remainder within certain 
bounds, then the partial remainder computed in the next iteration 4 * parrem - 
qtable (up , ud) * divsr would continue being appropriately bounded by the 
divisor. 
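
To make the statement of (C3) concrete, the following Python sketch evaluates it with exact fractions on sample points for the eighth row of the table only (up from -20 to -17, ud from 8 to 15), using the entries given in Section 4.2. The helper names and the sampling grid are hypothetical, and sampling is merely a sanity check, not the proof discussed in this paper.

```python
from fractions import Fraction

def q_row8(up, ud):
    """Entries of the eighth row as specified in Section 4.2."""
    if ud <= 10 or (ud == 11 and up in (-20, -19)):
        return -2
    return -1

def c3_holds(parrem, divsr, up, ud, q):
    """(C3) as an implication: hypotheses imply the bound on the next remainder."""
    hypothesis = (-2 * divsr <= 3 * parrem <= 2 * divsr
                  and ud <= 8 * divsr < ud + 1
                  and up <= 32 * parrem < up + 1)
    conclusion = -2 * divsr <= 12 * parrem - 3 * q * divsr <= 2 * divsr
    return (not hypothesis) or conclusion

STEPS = 8  # sample points per unit interval of each index

def sample_cell(up, ud):
    """Yield (parrem, divsr) pairs whose truncations are up and ud."""
    for i in range(STEPS):
        for j in range(STEPS):
            yield (Fraction(up * STEPS + i, 32 * STEPS),
                   Fraction(ud * STEPS + j, 8 * STEPS))

all_ok = all(
    c3_holds(p, d, up, ud, q_row8(up, ud))
    for up in range(-20, -16)
    for ud in range(8, 16)
    for p, d in sample_cell(up, ud)
)
```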

In [iSI 1 1 )j . we reported two different methods for proving this invariant. In ma, 
(C3) was automatically proved in RRL by modeling the quotient selection table as 
a function over integers, and by performing case analysis on the table indices up 
and ud. This leads to 1536 cases, 768 cases each for proving the upper bound 
and lower bound, respectively. Dontcare entries are modeled by out-of-bound 
integers, and the intermediate cases generated are extremely cumbersome. In 
0, the proof was done using an intensional formulation of the quotient table 
by abstracting table entries in terms of boundary value predicates proposed in 
fp. This approach requires user guidance in terms of additional lemmas besides 
the manual abstraction of the table. Establishing the correctness of this manual 
abstraction is nontrivial. 

^1 Instead of using fractional numbers for indices, it is more convenient and faster 
for RRL to use their scaled integer versions as indices to the table. So all row and 
column indices are scaled up by 8. Scaling up effectively leads to using number 
representations of bit vectors of the shifted truncated partial remainder estimate 
and the truncated divisor estimate by dropping the decimal point. 

We discuss below how a simple correctness proof of (C3) can be done based 
on the proposed approach that avoids many of these problems. 



4.4 Correctness Proof 

The correctness proof of (C3) is done by case analysis on table entry values 
rather than on the indices. For the lower bound, this leads to 6 top level cases: 5 
corresponding to the entry values in the subrange [m(2) ... 2], and one case 
generated for the dontcare entry value. Six cases are generated for the upper 
bound as well. 



The Case of Quotient Digit 0 For qtable(up, ud) = 0, (C3) simplifies to 

(C3.0): (-divsr) <= (6 * parrem) <= divsr if 

(-2 * divsr) <= (3 * parrem) <= (2 * divsr) and 
(ud <= (8 * divsr) < (ud +1)) and 

(up <= (32 * parrem) < (up + 1)). 

Consider the subcase of this simplified formula to show - divsr <= 6 * parrem. 
This formula could have been verified using different values of up and ud for which 
the qtable gives 0. Instead, we illustrate how quantifier elimination can be used 
to derive a simple constraint on up and ud based on the algorithm discussed in 
subsection 3.2. This constraint can be checked more easily. 

The negated formula is: 

(Exists, parrem, divsr) [ 
(6 * parrem < -divsr) and (-2 * divsr <= 3 * parrem) and 
(3 * parrem <= 2 * divsr) and 
(ud <= 8 * divsr) and (8 * divsr < ud + 1) and 
(up <= 32 * parrem) and (32 * parrem < up + 1)] 

The non-index variable parrem can be eliminated by one Fourier elimination 
step PI by cross-multiplying the coefficients of parrem. The resulting formula is 
6 * up < -32 * divsr. One more Fourier step on this formula using ud <= 8 
* divsr eliminates the remaining nonindex variable divsr to give 48 * up < 
-32 * ud. The constraint on indices is generated by negating and simplifying 
this formula: 



(0, Lowerbound): up >= -2/3 ud.   (I) 
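
The two Fourier steps leading to (I) can be spelled out as follows. Only the pair of inequalities that produces the constraint is shown at each step; the remaining conjuncts of the negated formula are omitted, and the intermediate formulas appear here divided by common factors (3 up < -16 divsr is 6 up < -32 divsr halved, and 3 up < -2 ud is 48 up < -32 ud divided by 16).

```latex
\begin{align*}
6\,\mathit{parrem} < -\mathit{divsr}
  &\;\Rightarrow\; 96\,\mathit{parrem} < -16\,\mathit{divsr}
  && \text{(multiply by 16)} \\
\mathit{up} \le 32\,\mathit{parrem}
  &\;\Rightarrow\; 3\,\mathit{up} \le 96\,\mathit{parrem}
  && \text{(multiply by 3)} \\
  &\;\Rightarrow\; 3\,\mathit{up} < -16\,\mathit{divsr}
  && \text{(parrem eliminated)} \\
\mathit{ud} \le 8\,\mathit{divsr}
  &\;\Rightarrow\; -16\,\mathit{divsr} \le -2\,\mathit{ud}
  && \text{(multiply by } -2\text{)} \\
  &\;\Rightarrow\; 3\,\mathit{up} < -2\,\mathit{ud}
  && \text{(divsr eliminated)} \\
\neg\,(3\,\mathit{up} < -2\,\mathit{ud})
  &\;\Leftrightarrow\; \mathit{up} \ge -\tfrac{2}{3}\,\mathit{ud}
  && \text{(constraint (I))}
\end{align*}
```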

For the second subcase, corresponding to the upper bound 6 * parrem <= 
divsr, the nonindex variables divsr, parrem are eliminated from the negated 
formula using Fourier steps, leading to the constraint 

(0, Upperbound): up + 1 <= 2/3 ud.   (II) 



Constraints on Indices for Other Quotient Digits The constraints for 
qtable(up, ud) = 1 and qtable(up, ud) = 2 are similarly derived. They are given 
below. 

(1, Lowerbound): up >= 1/3 (ud + 1)   (III) 

(1, Upperbound): up + 1 <= 5/3 * ud   (IV) 

(2, Lowerbound): up >= 4/3 (ud + 1)   (V) 

The constraints on the indices for the lower and upper bounds for the table 
entries m(2) and m(l) can be similarly calculated. These lead to 3 additional 
constraints on the indices. One additional case is generated for the table entry 
dontcare. All of these 9 cases can then be established by case analyses over the 
index values. 

Note that there is no constraint on indices deduced from the upper bound 
subcase for the entry value 2. Thus the upper bound holds for all values of up 
and ud. This is evident from the P-D Plot which shows that the maximum value 
of up is 8/3 ud for choosing the quotient digit 2 and the hypothesis in (C3) 
ensure that this is always the case. 

The validity of the invariant (C3) is reduced to showing that the above 9 
constraints on indices up, ud are satisfied for different quotient digit values. 
These constraints can be checked exhaustively by explicitly plugging in various 
values of up, ud which give rise to each of quotient digits. Below, we show how 
structure about the indices can be exploited further to check these constraints 
without having to explicitly substitute values of up and ud. 

Since the above constraints are simple inequalities, and indices are subranges 
over numbers, this information can be used to simplify this check as illustrated 
below. 

Consider constraint (I): up >= -2/3 ud. For qtable(up, ud) = 0, ud 
ranges over [8 ... 15], meaning -2 * ud is in the range [m(30), m(16)]. The 
constraint is satisfied for all values of up greater than or equal to m(5). The 
remaining values of up, ud to be considered are: 

[(m(6), 10), ..., (m(6), 15), (m(7), 12), ..., (m(7), 15), 
(m(8), 12), ..., (m(8), 15)]. 

For up = m(6), ud is between 10 and 15, so 9 <= ud as per the constraint. 
Similarly, when up = m(7) or up = m(8), ud is between 12 and 15; the constraint 
is satisfied in both cases. So all cases are considered. The index domain structure 
can thus be exploited to further simplify the checking of these constraints. 
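
The remaining index pairs for constraint (I) are few enough to check directly. In the Python sketch below the constraint is written as 3 * up >= -2 * ud to stay within integer arithmetic; the names are hypothetical.

```python
def constraint_I(up, ud):
    """Constraint (I), up >= -2/3 * ud, cleared of fractions."""
    return 3 * up >= -2 * ud

# Index pairs with table entry 0 and up below m(5), as listed above.
remaining = (
    [(-6, ud) for ud in range(10, 16)]
    + [(-7, ud) for ud in range(12, 16)]
    + [(-8, ud) for ud in range(12, 16)]
)

remaining_ok = all(constraint_I(up, ud) for up, ud in remaining)
```

For up = m(5) the constraint already holds over the whole column range, while a pair such as (m(6), 8), which is not a zero entry of the table, would violate it.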

A similar analysis works for constraint (II) and other quotient digits, in- 
cluding the dontcare entry. 

5 Implementation Status 

The proposed approach has been integrated into the theorem prover RRL. The 
prover provides support for specifying and reasoning about finite tables as a 
special data type. Sparsity information about the input tables is automatically 
computed by RRL based on the table dimensions and the frequency of occurrence 
of the entries. This analysis is integrated with RRL’s heuristics to reason about 
conjectures expressed using tables. 

If a table-expression appears in a conjecture, and the associated table is 
sparse, case analysis based on table entries is invoked. Otherwise, induction 
based on table indices is done in the usual way. A table definition is preprocessed 
by RRL to generate a table cover set, which is a map from table entries to 
sets of index tuples. For each table entry, a top level case is generated by 
replacing the table-expression in the conjecture by the table entry. The resulting 
formula is simplified using rewriting and the decision procedures of RRL. If the 
validity of the case does not depend upon indices, such as the main invariant (03) 
of SRT division for the dontcare values, simplification would typically prove it. 
Disproving any top level case leads to the conjecture being disproved. Otherwise, 
projection is attempted to eliminate the nonindex variables from the formula. If 
projection is unsuccessful for a top level case, i.e., it is not possible to generate 
a formula all of whose variables are table indices, then entry based case 
analysis is abandoned and RRL attempts induction based on indices. 
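The table cover set described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the toy table, the function names, and the sparsity threshold (ratio) are all hypothetical, and RRL's actual sparsity analysis is not reproduced here.

```python
from collections import defaultdict

def table_cover_set(table):
    """Map each distinct table entry to the set of index tuples (i, j) holding it."""
    cover = defaultdict(set)
    for i, row in enumerate(table):
        for j, entry in enumerate(row):
            cover[entry].add((i, j))
    return dict(cover)

def is_sparse(table, ratio=0.25):
    """Hypothetical criterion: few distinct entries relative to the table size."""
    size = sum(len(row) for row in table)
    return len(table_cover_set(table)) <= ratio * size

# A toy 3 x 4 table with only three distinct entries.
toy_table = [
    [0, 0, 1, 1],
    [0, 1, 1, 2],
    [1, 1, 2, 2],
]
toy_cover = table_cover_set(toy_table)
```

One top level case per key of toy_cover gives three cases instead of twelve, and the index sets of all entries together still cover every cell of the table.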

Projection is currently implemented only for tables with numeric indices and 
entries. We have modified the linear arithmetic decision procedure implemented 
in RRL to do projection. The procedure has been changed to be invoked with a 
set of variables to be eliminated from the input formula. Variables are eliminated 
one at a time using Fourier’s elimination procedure. 
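To make the projection step concrete, here is a minimal Python sketch of one-variable Fourier elimination over the rationals. It is not RRL's procedure: the constraint representation (a pair of a coefficient dictionary and a bound, read as "sum of coeff * variable <= bound") and the function name eliminate are assumptions made for illustration.

```python
from fractions import Fraction

def eliminate(constraints, var):
    """Project `var` out of a list of constraints (coeffs, bound),
    each meaning sum(coeffs[v] * v) <= bound."""
    upper, lower, rest = [], [], []
    for coeffs, bound in constraints:
        c = coeffs.get(var, 0)
        if c > 0:
            upper.append((coeffs, bound, c))    # gives an upper bound on var
        elif c < 0:
            lower.append((coeffs, bound, c))    # gives a lower bound on var
        else:
            rest.append((coeffs, bound))        # var does not occur
    for up_coeffs, up_bound, cu in upper:
        for lo_coeffs, lo_bound, cl in lower:
            # Scale both constraints so var has coefficient +1 and -1, then add.
            combined = {}
            for v in set(up_coeffs) | set(lo_coeffs):
                if v == var:
                    continue
                value = Fraction(up_coeffs.get(v, 0), cu) + Fraction(lo_coeffs.get(v, 0), -cl)
                if value != 0:
                    combined[v] = value
            bound = Fraction(up_bound, cu) + Fraction(lo_bound, -cl)
            rest.append((combined, bound))
    return rest

# x - y <= 0,  y <= 5,  -x <= -1  (that is, 1 <= x <= y <= 5)
system = [({"x": 1, "y": -1}, 0), ({"y": 1}, 5), ({"x": -1}, -1)]
without_y = eliminate(system, "y")
without_xy = eliminate(without_y, "x")
```

Eliminating y leaves x <= 5 together with -x <= -1, and eliminating x then leaves the variable-free constraint 0 <= 4. Each elimination can multiply the number of constraints and enlarge the coefficients, which is the growth mentioned later in this section.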

Case analysis based on table entry values for attempting a conjecture reduces, 
in the worst case, to case analysis based on index values. So there is not much 
additional overhead, but the advantages can be significant in the case of sparse and 
weakly sparse tables. However, the current implementation does not reuse any 
part of a failed proof attempt based on case analysis of entries. An entry based 
case analysis might succeed in establishing the conjecture for most of the entry 
values and fail only for a few entry values. In such cases, it suffices to 
consider the indices corresponding to these failing entry values and perform 
explicit case analysis based on those indices. As of now, we perform the case analysis 
for all index values. 

The preliminary implementation has been successfully used to prove the main 
invariant of SRT division and other examples. These experiments have led us 
to focus on several interesting performance enhancements in RRL. Currently, 
the intermediate coefficients generated by projection can become very large, and 
this adversely affects the performance of entry based case analyses, especially 
since numbers are represented in unary in RRL. A substantial speed-up in per- 
formance will result if arithmetic calculations are directly supported in RRL. 
Other optimizations to keep the numbers smaller, such as dividing structure of 
tables-their entries as well as indices. 
6 Explicit vs. Abstracted Quotient Digit Selection Tables 



SRT division has also been mechanically verified using Analytica P and PVS 
pr7| . A major difference between these verification efforts and the proposed 
approach is in the representation of the quotient digit selection table. These 
approaches use an abstracted version of the table whereas an explicit and ex- 
haustive quotient digit selection table is used by the proposed approach. 

The main feature of the radix 4 SRT divider proof in P was an abstraction of 
the quotient selection table using boundary value predicates. From the quotient 
selection table in ra, an intensional specification of the table is developed that 
only considers the minimum and maximum values of a partial remainder for every 
quotient digit. Nothing is specified about other intermediate partial remainder 
values for a quotient digit and the invariant is not checked for these values. It 
is, therefore, possible to certify erroneous quotient digit selection tables correct 
using this approach unless the abstraction is proved correct. As stated in P, the 
primary reason for using an abstracted table is to reduce the number of cases in 
the correctness proof. 

A radix 4 SRT divider based on an abstract table similar to the one used in 
P was automatically verified using RRL p. This was primarily an exercise in 
determining how much of the correctness proof in P could be automated in RRL 
without using any symbolic computation algorithms of computer algebra systems 
used in p. Much to our pleasant surprise, we found that no extensions had to 
be made to RRL. RRL was able to find proofs of all the formulas automatically, 
without any interaction. 

In order to avoid the potential gaps in the correctness proofs due to an 
abstract table, we subsequently formalized the SRT divider using an explicit and 
exhaustive representation of the quotient digit selection table with 768 entries 
m- Such a table can be obtained by a direct translation of a hardware PLA 
(programmable logic array), realizing the quotient selection logic of a commercial 
SRT divider. The SRT divider based on this explicit table was fully automatically 
verified in a push-button mode in RRL. Furthermore, the proofs could be done 
easily and quickly using RRL compared to the proofs based on the intensional 
specification of the table, even though a lot more cases had to be considered; 
the instantiated conjectures could be easily established by RRL. Details of these 
proofs can be found in HD!. 

The proofs of most of the subcases generated by RRL based on the explicit 
quotient digit selection table share a common structure. In this paper, we have 
shown how this commonality in proofs can be automatically detected by the 
theorem prover RRL, and used to further simplify the correctness proofs. The 
proof described in this paper is based on an explicit representation of quotient 
digit selection table unlike those in m- No manual abstraction is done. The 
entry based case analysis considers all possible values of partial remainders while 
verifying the invariant, and not just the boundary values as done in jlbUj . It is 
encouraging that by exploiting simple characteristics of tables such as sparsity, a 
comprehensive correctness proof simpler than the one based on manual abstraction 
1 1 lisj can be automatically obtained, while eliminating the scope for errors 
introduced by manual abstraction. 

The proofs reported in HSI using the PVS system are more general. The spec- 
ification and the proof are organized manually using sophisticated mechanisms of 
the PVS language, which supports higher-order logic, dependent types, overload- 
ing, a module facility, and a special data type table [IB|. Miner and Leathrum’s work 
0 is a further generalization of the proof in [TB|. Their paper mentions the analy- 
sis of three different SRT lookup tables using PVS. The URL 
ftp://airl6.larc.nasa.gov/pub/fm/larc/fp_div/dump_ieee_srt_divide has the PVS 
proof dump of a radix 4 SRT divider. The quotient selection table used there is 
based on m- The table specification uses dependent subtypes to constrain the 
index values. Not all possible values of the table indices are explicitly enumer- 
ated in the table. Only those values of the partial remainder are considered for 
which there is at least one divider value for which a legal quotient digit can be 
chosen, i.e., the first and the last rows of Table 1 in this paper are not enumer- 
ated. Further, the table is not explicit in that it uses 5 bits to approximate the 
partial remainder in most of the cases, and uses 7 bits only for certain boundary 
cases as described in H31. Each entry of this table is expanded into 4 entries of 
the explicit table as described in this paper. 

The PVS dump has 54 top level cases. 33 additional proof obligations are 
generated to discharge the assumptions associated with the dependent types. 
Unlike the correctness proofs in PJ,?] all possible values of the partial remainder 
and divisor are considered. However, the specification is developed with consid- 
erable human ingenuity, and the resulting proof is manually driven, even though 
parts of the proof can be done automatically using previously developed PVS 
tactics. 

In contrast, we have focussed on an explicit table specification that can be 
easily generated from commonly used representations of the quotient tables in 
practice. We have then described how simple heuristics can be built into the 
prover to eliminate unnecessary and tedious aspects of correctness proofs based 
on the explicit representation. 

7 Concluding Remarks 

The idea of performing case analysis based on table entries when the number 
of such entries is much smaller than the table size, seems pretty obvious. But 
somehow, it has not been emphasized in the automated deduction literature. No 
special heuristics for reasoning about tables implementing finite functions have 
been studied. 

In this paper, we have emphasized exploiting the entries in a table for doing 
case analysis in mechanizing proof attempts of conjectures involving the table. 
The number of cases generated by the proposed approach is the same as that 
based on case analysis on indices in the worst case. Given indices drawn from 
domains with cardinality d1 and d2, the total number of cases generated by index 
based case analysis is d1 * d2. In the proposed approach, for any entry v, even 
when the index values need to be explicitly enumerated, only those values of the 
indices i and j are considered where table(i, j) = v. Thus, in the worst case, the 
number of cases considered is no worse than that of index based case analysis. 

The effectiveness of the proposed approach is demonstrated on a nontrivial 
example of the radix 4 SRT division circuit which uses a weakly sparse quotient 
digit selection table. We believe that other aspects of table structure can be 
effectively exploited as well for attempting a conjecture about a table, particu- 
larly ordering and relationship between indices as well as any possible functional 
relationship among indices and the table entries. This was demonstrated for ver- 
ifying constraints on indices deduced by quantifier elimination from simplified 
conjectures for particular table entries. 

We plan to use RRL to attempt proofs of properties of other circuits implemented 
using large tables, which is likely to be helpful in developing better 
and additional heuristics for reasoning about finite tables. Digit-serial arithmetic 
circuits, including square root and elementary functions such as exponentials and 
logarithms, also use lookup tables. The tables used for the square root function 
exhibit sparsity much like division. 

The digit-serial realizations of elementary functions are also specified by re- 
currences similar to those used for division (im pp. 378-434). The lookup tables 
typically used in realizing these functions contain precomputed values of terms 
in the recurrence that cannot be directly realized in hardware. These tables do 
not exhibit sparsity, but the entries in such a table are typically related by 
a constant additive or multiplicative factor, which can be used to simplify the 
proofs. 

These recurrences specify how the partial result should be extended at each step 
by a bit based on the range of values to which the partial result computed so far 
belongs. The convergence of these algorithms is established based on the partial 
result always being within certain bounds at each step. The choice of the bit at 
each step is done by having a lookup table with only a few leading bits of the 
partial result. Sparsity considerations appear to be applicable to such lookup 
tables. 

The use of simple heuristics for handling large tables facilitates direct and 
explicit specification of tables used in circuit implementations, instead of man- 
ually abstracting them using predicates. The use of direct and explicit tables 
has several advantages. It avoids the errors that may be introduced by abstrac- 
tions. Additional proof obligations for proving abstractions correct are avoided. 
Moreover, the gap between implementation and verification models of circuits 
is reduced. The main drawback of using explicit tables, however, is the large 
number of cases that are generated in the correctness proofs. The proposed 
approach addresses this concern by, first, 
avoiding abstractions by directly considering tables as they are, and, second, 
exploiting the structure of tables: their entries as well as indices. 
Abstract. A functional language $\Lambda_{LA}$ is given. A sub-set $\Lambda^{T}_{LA}$ of $\Lambda_{LA}$ 
is automatically typable. The types are formulas of Intuitionistic Light 
Affine Logic with polymorphism à la ML. Every term of $\Lambda^{T}_{LA}$ can reduce 
to its normal form in, at most, poly-steps. $\Lambda^{T}_{LA}$ can be used as a prototype 
of programming language for P-TIME algorithms. 

1 Introduction 

In [3], Girard introduced Light Linear Logic which captures P-TIME. This 
means that the cut-elimination process of Light Linear Logic terminates in poly- 
nomial time with respect to the dimension of any given derivation, and, vice 
versa, all P-TIME Turing machines can be encoded as data-types in Light 
Linear Logic. 

Girard left as an open problem finding a concrete syntax for ILLL, namely for 
Intuitionistic Light Linear Logic. This paper introduces an untyped functional 
language $\Lambda_{LA}$ which has a typable sub-set $\Lambda^{T}_{LA}$. The types for $\Lambda^{T}_{LA}$ are formulas 
of ILLL with a polymorphism à la ML. The types can be inferred automatically. 

Before introducing $\Lambda_{LA}$, it is worth recalling the main mechanism of ILLL to 
bound the cut-elimination complexity. The key point is avoiding the proliferation 
of the contraction rule 

$$\frac{\Gamma, A, A \vdash B}{\Gamma, A \vdash B}\ (Contraction)$$

that can take place when eliminating the cuts in a derivation of Intuitionistic 
Logic. For example, let $C_n$ be the typable $\lambda$-term $\overline{2}\,\overline{2} \ldots \overline{2}$, which has $n \geq 2$ Church 
Numerals $\overline{2} \stackrel{def}{=} \lambda x y.x(x y)$ in it. The length of the left-most reduction of $C_n$ to 
its normal form grows exponentially in n. This happens essentially because 
there are redexes that, once reduced, yield two residual redexes each, thus de- 
veloping a reduction space with the form of a tree. Logically, this means that 
there are contraction rules that duplicate other contraction rules as an effect of the 
cut-elimination. Figure 1 gives an intuitive pictorial representation of this. 
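The blow-up caused by applying the numeral 2 to itself can be observed with ordinary Python closures. This is only an illustrative sketch: it evaluates the untyped term with Python's own evaluation order, the names two, tower and to_int are hypothetical, and it measures the value of the normal form rather than the number of reduction steps.

```python
def two(f):
    """Church numeral 2: apply f twice."""
    return lambda x: f(f(x))

def tower(n):
    """Build 2 2 ... 2 with n copies of 2, application associating to the left."""
    term = two
    for _ in range(n - 1):
        term = term(two)
    return term

def to_int(numeral):
    """Read a Church numeral back as a Python integer."""
    return numeral(lambda k: k + 1)(0)

values = [to_int(tower(n)) for n in (2, 3, 4)]
# values == [4, 16, 65536]
```

Each additional copy of 2 exponentiates the previous value, so tower(5) already denotes a numeral far too large to be evaluated this way.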

Fig. 1. From lists to trees of contractions 

The non-exponential proliferation of the c-nodes in ILLL is a consequence of 
two main aspects. First, in ILLL, a derivation can be duplicated only if enclosed 
into a region, called !-box. Second, any !-box can derive a formula from at most 
a single assumption. For example, let $\Pi$ be a derivation of ILLL, proving B 
from the single assumption A. Then, the !-box $!\Pi$, built on $\Pi$, is: 

A cut-elimination step duplicating $!\Pi$ is such that: 




The !-box divides the space in two parts: one inside it, and one outside it. This 
means that, unlike in Intuitionistic Logic, the c-node above, which contracts 
$!A$, does not trivially extend any tree of c-nodes inside the !-box: there is the !- 
box border in between. To a first approximation, this means that any exponentially 
growing tree of c-nodes, in a derivation $\Pi$ of ILLL, is an (exponential) function 
of the maximal number of nested !-boxes of $\Pi$. But, this is a detail... 

The second feature of ILLL is that any !-box has at most one assumption. 
This means that, if at some stage of the cut-elimination process, a !-box has a 
tree of c-nodes in it, then every c-node was present somewhere, in the same !- 
box, since the beginning of the cut-elimination. In this way, the non-exponential 
proliferation of c-nodes holds when composing the deductions of ILLL. 

We complete the description of the novel part of ILLL by describing the 
behavior of a second region in it. If $\Pi$ is a deduction proving B from two, 
possibly empty, sets of assumptions $A_1 \ldots A_n$ and $A'_1 \ldots A'_m$, then the §-box $§\Pi$ 
with $\Pi$ in it, is: 
The §-boxes increase the expressiveness of ILLL. Without §-boxes, Church Nu- 
merals could not be encoded in ILLL. The presence of an arbitrary number 
of assumptions of a §-box that can be contracted should not be a cause for concern. The non- 
exponential proliferation of c-nodes is preserved because §-boxes themselves 
cannot be duplicated. Hence, the creation of trees with too many c-nodes below any 
§-box is forbidden. 

The computational behavior of the language $\Lambda_{LA}$ incorporates the features 
above of the cut-elimination of ILLL. The language $\Lambda^{T}_{LA}$, however, is the 
functional counterpart of a natural deduction for the sequent calculus that As- 
perti introduced, where the original sequent calculus in [3] has been greatly 
simplified. The point of Asperti's calculus is to move from ILLL to its affine version ILAL, 
where unrestricted weakening is allowed. 



Contributions. This work introduces an untyped functional language $\Lambda_{LA}$. It 
has a sub-set $\Lambda^{T}_{LA}$ of typable elements. The types for $\Lambda^{T}_{LA}$ are the formulas of a 
natural deduction that we introduce. The natural deduction proves a sub-set of 
the formulas derived by Asperti’s sequent calculus for Intuitionistic Light Affine 
Logic ILAL p. The types have a polymorphism à la ML 0: only external 
quantifiers are allowed. The types for the elements of $\Lambda^{T}_{LA}$ can be automatically 
inferred by a type inference algorithm. In particular, the type $\tau$ inferred for any 
$M \in \Lambda^{T}_{LA}$ is principal. 

Although Asperti’s sequent calculus for ILAL greatly simplifies the original 
system for Light Linear Logic in [3], the natural deduction it induces still has 
enough rules to make a concrete functional syntax quite heavy. Following a 
methodological hint already present in earlier work, the following simplification is introduced: 
contraction is left as an implicit structural rule of the natural deduction. This 
choice influences the design of $\Lambda_{LA}$ as follows: $\Lambda_{LA}$ must be defined on two 
disjoint sets of variable names. One contains the names for the terms that can 
be duplicated during the computations. The other set contains the names for 
linearly used terms. This makes $\Lambda_{LA}$ a sort of call-by-value language: the two 
kinds of variables “decide” what can be duplicated, and when. 

The language $\Lambda^{T}_{LA}$ is strongly normalizable and Church-Rosser. It is also 
correct with respect to P-TIME. Namely, if $M \in \Lambda^{T}_{LA}$, then M represents an 
algorithm in P-TIME. Moreover, $\Lambda^{T}_{LA}$ admits a poly-step reduction strategy. 
This means that any term M of $\Lambda^{T}_{LA}$ can be reduced to its normal form in, at 
most, a number of steps polynomial in the dimension of M. 

The completeness of $\Lambda^{T}_{LA}$ with respect to P-TIME is left open. 

“Index”. Section 2 introduces the untyped language $\Lambda_{LA}$. Section 3 defines the 
natural deduction for ILAL, decorated with the terms of $\Lambda^{T}_{LA}$. Section 4 is about 
the correctness of $\Lambda^{T}_{LA}$ with respect to P-TIME. Section 5 recalls the expres- 
siveness of $\Lambda^{T}_{LA}$ and shows some encodings of usual $\lambda$-Calculus terms in it. When 
reading Sections 2 and 3, we suggest referring to Section 5 for programming 
examples in $\Lambda^{T}_{LA}$. Section 6 introduces the poly-step strategy. Section 7 defines 
the type inference algorithm in natural deduction style. Section 8 concludes the 
paper with some reference to related and future work. Appendix A introduces 
some of the details about $G_{LA}$ that Asperti skipped in p. 

Acknowledgments. This work has been developed also with some useful dis- 
cussions with Andrea Asperti, Stefano Guerrini, Tom Kranz, Yves Lafont. The 
work has been supported by a TMR-Marie Curie Grant, contract n. ERBFM- 
BICT961411. 

2 The Functional Language 

The Syntax. Let Term-Variables be a set of identifiers ranged over by x, y, w, z, 
and !-Term-Variables be another set of identifiers ranged over by X, Y, W, Z. 
Moreover, let $\chi$ range over Term-Variables $\cup$ !-Term-Variables. The set 
$\Lambda$ of the functional terms is given by: 

$$M, N, P, Q ::= \text{Term-Variables} \mid \text{!-Term-Variables} \mid \lambda\chi.M \mid (MN) \mid\ !M \mid\ !(M)[N/x] \mid §M \mid §(M)[M_1/\chi_1, \ldots, M_n/\chi_n] \mid \text{let } X = M \text{ in } N$$

The term constructors $\lambda$, !, §, and let bind free variables. As usual, $\lambda$ is such 
that, if $\chi$ is in the free variable set FV(M) of M, then it cannot be in FV($\lambda\chi.M$). 
The term constructor ! can be applied either to a closed term M, yielding the 
closed !-box !M, or to an open term M with a single free variable x. In this case, the 
free variables of the !-box $!(M)[N/x]$ obtained are in FV(N). In particular, N is called the 
interface of the !-box just built, and M is its body. The operator § builds §-boxes. 
It can be applied to a term M with free variables $\chi_1, \ldots, \chi_n$, with $n \geq 1$. All 
$\chi_1, \ldots, \chi_n$ get bound, and the free variables of the obtained §-box are in 
$\bigcup_{i=1}^{n} FV(M_i)$. Again, all the $M_i$ are the interface of the §-box just built, and M is 
its body. Finally, if $X \in FV(N)$, then let binds X. The square brackets in !- and 
§-boxes belong to the syntax, and delimit the interface of the boxes themselves. 

The substitution of M for $\chi$ in N is denoted by $N\{M/\chi\}$. It is defined as a 
partial function. Namely, it behaves like the usual variable-clash free substitution 
only in one of the two following cases: 

- M is either a !-box, or in !-Term-Variables, and $\chi$ is in !-Term-Variables, 

- M is any term, and $\chi$ belongs to Term-Variables. 

Otherwise, the substitution is undefined. 
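The two cases in which the substitution is defined amount to a simple guard on the shape of the substituted term. The Python sketch below only illustrates that guard under assumptions of ours: the classes Var, BangBox and App and the function substitution_defined are hypothetical names, interfaces of boxes are omitted, and the substitution itself is not implemented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str
    banged: bool      # True for an element of !-Term-Variables

@dataclass(frozen=True)
class BangBox:
    body: object      # the interface of the !-box is omitted in this sketch

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def substitution_defined(term, chi: Var) -> bool:
    """Say whether substituting `term` for the variable `chi` is defined."""
    if not chi.banged:
        return True   # any term may replace a linear variable
    # A !-variable may only be replaced by a !-box or by another !-variable.
    return isinstance(term, BangBox) or (isinstance(term, Var) and term.banged)

x = Var("x", banged=False)
X = Var("X", banged=True)
Y = Var("Y", banged=True)
application = App(x, x)
```

With these definitions, application may replace x but not X, while BangBox(x) and Y may replace either variable.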

The elements of $\Lambda$ are considered up to $\alpha$-equivalence. For example, $!(M)[N/x]$ 
is $\alpha$-equivalent to $!(M\{y/x\})[N/y]$. Parentheses will be omitted when writing 
terms, if no ambiguity exists. 

The Dynamics. The rewriting system $\leadsto$ on $\Lambda$ is the contextual closure of the 
relation > on $\Lambda \times \Lambda$ given below: 

• $\beta$-group 

$(\lambda x.M)N$ >1 $M\{N/x\}$ 
$(\lambda X.M)Y$ >2 $M\{Y/X\}$ 
$(\lambda X.M)!N$ >3 $M\{!N/X\}$ 
{\X.M)\(N)[^ /^] >4 

[\X.M)\[N)f /^] >5 if P ^ !-Term-Variables 
...] 


>5 






. . !(!(V)T/x])['5/„/] 
• §§-group 

>1 §(Mn.,})[ ] 

§(M)[. . . . . . ] §(Mn..})[ ] 

•••] >3 

§(M)[---§’^/x.---] > 4 §(M{'^ixJ)[ ] 

§(M)[. • • • • • ] >5 §(M{'Wl^/^Ux.})[- • ■ ■■■] 

§(M{K^)r/.i^xj)[ . 

• let -group 

let X = Y in P >1 $P\{Y/X\}$ 
let X = !M in P >2 $P\{!M/X\}$ 
let A /^] in P >3 let A = A in 



Of course, the $\alpha$-equivalence must be used to avoid variable clashes when 
rewriting terms. As usual, $\leadsto^{*}$ is the reflexive and transitive closure of $\leadsto$ on $\Lambda$. 
The functional language $\Lambda_{LA}$, subject of this work, is ($\Lambda$, $\leadsto$). 



Discussion. It is worth giving some intuition about the meaning of the dynamics. 

$\Lambda_{LA}$ is a kind of restriction of the untyped call-by-value $\lambda$-Calculus|0|, whose 
rewriting rule is: 

$$(\lambda x.M)N \rightarrow_{\beta_v} M\{N/x\} \quad \text{if } N \text{ is a value},$$

where the variables and the $\lambda$-abstractions are values. Namely, only the terms 
with a specific form can be substituted for the variables. The rewriting system 
$\leadsto$ behaves analogously to $\rightarrow_{\beta_v}$. Following the definition of the partial substi- 
tution of terms for variables, given above, no constraints exist when replacing 
$x \in$ Term-Variables by any term. The idea is that, in the typable sub-set $\Lambda^{T}_{LA}$, 
x stands for any non duplicable, or linear, entity. Consequently, substituting 
M for x cannot result in the duplication of M. On the contrary, only the !- 
boxes and the elements of !-Term-Variables can be substituted for a variable 
$X \in$ !-Term-Variables. This is because, in the typable sub-set $\Lambda^{T}_{LA}$, X represents 
duplicable, or non linear, resources. In the usual call-by-value terminology, any 
term is a value with respect to the linear variables. On the other side, only the 
!-boxes and the !-Term-Variables, which represent the duplicable regions of 
ILLL, are values for the non linear variables. 

Take the $\beta$-group, for example. The first four axioms follow what was just said 
above. The axiom >5 needs a side condition to tell it apart from >4. In 
particular, >5 serves to avoid the substitution for X of the interface P, since P 
might not be a !-box. 

As a second example, consider the !!-group. The relation defined by the !!- 
rules makes two terms communicate, when the two terms are contained in 
two distinct !-boxes. The communication takes place by substituting the term 
contained in one !-box for the free variables of the term contained in the other 
!-box. The rule >1 deals with the case where one term is in a !-box which con- 
stitutes the interface of another !-box, whose content is M. The communication 
between N and M can take place independently of the form of N, according 
to what was said above. Otherwise, N must reduce to a further, deeper !-box, before 
the substitution takes place: see the rule >4. The remaining !!-rules cover all the 
possible disjoint cases, according to the form of the !-box in the interface. 

All the other groups preserve the definition of the substitution, and are de- 
fined to cover all disjoint cases, according to the form of the term being substi- 
tuted. 

3 The Type Assignment 

The Types. Let us assume to have a set Type-Variables, ranged over by $\alpha, \beta, \gamma$, 
and $\delta$. The types are defined by the grammar: $\tau, \rho, \mu, \nu ::= \text{Type-Variables} \mid \tau \multimap \rho \mid\ !\tau \mid §\tau$. 
The type schemes originate from the grammar: $\sigma ::= \forall\alpha_1 \ldots \alpha_n.\tau$, 
with $n \geq 0$. As usual, $\forall$ is a binder: the free variables of $\forall\alpha_1 \ldots \alpha_n.\tau$ 
are FV($\tau$) \ {$\alpha_1 \ldots \alpha_n$}. We say that $\tau$ is !-exponential, and we write !-exp($\tau$), if 
there is a type $\tau'$ such that $\tau = \forall\alpha_1 \ldots \alpha_n.!\tau'$, with $n \geq 0$. 

A set of assumptions takes the form $\{\chi_1 : \sigma_1, \ldots, \chi_n : \sigma_n\}$, where every $\sigma_i$ 
is a type scheme, and every $\chi_i$ belongs to Term-Variables $\cup$ !-Term-Variables. 
A set of assumptions $\Gamma = \{\chi_1 : \sigma_1, \ldots, \chi_n : \sigma_n\}$ is well formed if it satisfies 
two constraints: (i) $\chi_i$ belongs to !-Term-Variables if, and only if, !-exp($\sigma_i$) 
holds, with $n \geq i > 0$; (ii) $\Gamma$ can be thought of as a function with finite domain 
$\{\chi_1, \ldots, \chi_n\}$. Namely, $\chi : \sigma_1$ and $\chi : \sigma_2$ cannot both belong to $\Gamma$ if $\sigma_1 \neq \sigma_2$, up to 
the obvious $\alpha$-equivalence on types. 

The statement !-exp($\Gamma$) holds on a well formed set of assumptions $\Gamma$ if all 
types in the co-domain of $\Gamma$ are !-exponential or, equivalently, if the domain of 
$\Gamma$ is a sub-set of !-Term-Variables. Consequently, a well formed set of assump- 
tions, whose domain is contained in Term-Variables, is linear. We take $\Gamma$ as 
a meta-variable for ranging over generic well formed sets of assumptions, $\Theta$ for 
denoting !-exponential sets of well formed assumptions, and $\Delta$, $\Phi$, $\Psi$, and $\Upsilon$ for 
dealing with the linear sets of well formed assumptions. 

The type substitutions are functions from Type-Variables to types. The 
notation $\{\tau_1/\alpha_1, \ldots, \tau_n/\alpha_n\}$ stands for a type substitution that simultaneously 
replaces every $\alpha_i$ by $\tau_i$, and which is the identity on all the type variables 
different from $\alpha_1, \ldots, \alpha_n$. The type substitutions are ranged over by S, R, T, 
and U. Moreover, for any type scheme $\sigma$ and any set $\Gamma$ of assumptions, $S\sigma$ and $S\Gamma$ 
denote the application of the obvious extensions of S to type schemes and sets 
of assumptions. In general, the substitutions do not preserve well formed sets 
of assumptions. For example, $\Delta = \{x : \alpha\}$ is well formed, but $S\Delta$ is not, if 
S maps $\alpha$ to a !-exponential type, e.g. $S = \{!\beta/\alpha\}$. So, for any well formed 
set of assumptions $\Gamma$, and substitution S, 
the compatibility predicate S-compatible-with-$\Gamma$ holds exactly when $S\Gamma$ is well 
formed. Moreover, it holds: 

Lemma 1. If S is $S_1 S_2$, and S-compatible-with-$\Gamma$, then so is $S_2$. 
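Constraint (i) of well-formedness and the compatibility predicate can be illustrated with a small Python sketch. The encoding is an assumption of ours: types are nested tuples tagged "var", "bang", "par" (for §) and "arrow", type schemes without quantifiers are used, a variable name starting with an upper-case letter is taken to be in !-Term-Variables, and all function names are hypothetical. Constraint (ii) holds automatically because a set of assumptions is represented as a dictionary.

```python
def is_bang_exp(ty):
    """!-exp: the type has the form !t."""
    return ty[0] == "bang"

def is_bang_variable(name):
    """Assumed naming convention: upper-case initial means !-Term-Variable."""
    return name[0].isupper()

def well_formed(assumptions):
    """Constraint (i): a variable is a !-variable iff its type is !-exponential."""
    return all(is_bang_variable(x) == is_bang_exp(t) for x, t in assumptions.items())

def apply_subst(subst, ty):
    """Apply a type substitution (dict from type-variable names to types)."""
    tag = ty[0]
    if tag == "var":
        return subst.get(ty[1], ty)
    if tag in ("bang", "par"):
        return (tag, apply_subst(subst, ty[1]))
    return ("arrow", apply_subst(subst, ty[1]), apply_subst(subst, ty[2]))

def compatible(subst, assumptions):
    """S-compatible-with-Gamma: applying S keeps the assumptions well formed."""
    return well_formed({x: apply_subst(subst, t) for x, t in assumptions.items()})

delta = {"x": ("var", "a")}                     # {x : a}, a linear assumption
bang_subst = {"a": ("bang", ("var", "b"))}      # maps a to !b
arrow_subst = {"a": ("arrow", ("var", "b"), ("var", "b"))}
```

Here delta is well formed, bang_subst is not compatible with it because the linear variable x would receive a !-exponential type, and arrow_subst is compatible.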

The type schemes can be ordered: $\forall\beta_1 \ldots \beta_n.\tau \geq \forall\alpha_1 \ldots \alpha_m.S\tau$ if both 
FV($\tau$) $\supseteq \{\beta_1, \ldots, \beta_n\}$, and FV($S\tau$) $\supseteq \{\alpha_1, \ldots, \alpha_m\}$, for a given S. 

The Typing Rules. For any well formed set of assumptions $\Gamma$, any functional 
term M, and any type scheme $\sigma$, we write the judgment $\Gamma \vdash_T M : \sigma$ if it is a 
conclusion of a deduction in the following system: 

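$$\frac{}{\Gamma, \chi : \sigma \vdash_T \chi : \sigma}\ (Ax)$$

$$\frac{\Gamma \vdash_T M : \forall\alpha.\sigma}{\Gamma \vdash_T M : \sigma\{\tau/\alpha\}}\ (\forall_E) \qquad \frac{\Gamma \vdash_T M : \sigma \quad \alpha \notin FV(\Gamma)}{\Gamma \vdash_T M : \forall\alpha.\sigma}\ (\forall_I)$$

$$\frac{\Theta, \Delta_1 \vdash_T M : \tau' \multimap \tau \quad \Theta, \Delta_2 \vdash_T N : \tau'}{\Theta, \Delta_1, \Delta_2 \vdash_T MN : \tau}\ (\multimap_E) \qquad \frac{\Gamma, \chi : \tau \vdash_T M : \tau'}{\Gamma \vdash_T \lambda\chi.M : \tau \multimap \tau'}\ (\multimap_I)$$

$$\frac{\Gamma \vdash_T N :\ !\tau' \quad x : \tau' \vdash_T M : \tau}{\Gamma \vdash_T\ !(M)[N/x] :\ !\tau}\ (!) \qquad \frac{\vdash_T M : \tau}{\Gamma \vdash_T\ !M :\ !\tau}\ (!_0)$$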



(§) 



m + n + p + q> 1 




0, Ai Ft Mi : n 


(1 < i < m) 


:!!p, 


(1 < i < n) 


0, 'Tk Ft Pk : §/ifc 


(1 < fc < p) 


0,Ti Ft Qi : 


(1 < ^ < v) 


Xl : Tl . . . Xm : Vm, iVl :!pi . . . XCn ..pn^ 




yi : p .1 . . .Pp : fip, Yi :\ui . . .Yq -.\vq Ft M : 


T 


0...Ai...-Pj...Ihk...ri...^T §(M)["^V,, • 


Mm / 

/ Xm 5 


"Vx,- 


Nn / 

/Xr^ , 




. . . f 

/ Vp 5 




] : §r 



'J 
$$\frac{\vdash_T M : \tau}{\Gamma \vdash_T §M : §\tau}\ (§_0) \qquad \frac{\Theta, \Delta_1 \vdash_T M : \sigma \quad X : \sigma, \Theta, \Delta_2 \vdash_T N : \tau}{\Theta, \Delta_1, \Delta_2 \vdash_T \text{let } X = M \text{ in } N : \tau}\ (let)$$

Observe that (Ax), ($!_0$), and ($§_0$) have implicit weakening, while ($\multimap_E$), (§), and 
(let) have implicit contraction. 

Definition 1. $\Lambda^{T}_{LA}$ is the subset of $\Lambda_{LA}$ typable by $\vdash_T$, namely, $M \in \Lambda^{T}_{LA}$ if, 
and only if, there are $\Gamma$ and $\sigma$ such that $\Gamma \vdash_T M : \sigma$. 

Lemma 2. The following rules are admissible in $\vdash_T$: 

X ■ F hx M ■. a" a > a' 

X F \-T M : a” 

, , L hx M -.a domain(L^) n domain(L) = 0 FV(M) C V C domain(L) 

{X-r(x)\x&y},r' hxM:r 

Proof. By structural induction on M . 

Clearly, rule (WR) simultaneously weakens and extends $\Gamma$. 

Lemma 3 (Substitution). 

1. Let $\Theta, x : \sigma, \Delta \vdash_T M : \sigma'$. Then $\Theta, \Phi, \Delta \vdash_T M\{N/x\} : \sigma'$, for any $\Theta, \Phi \vdash_T 
N : \sigma$. 

2. Let $\Theta, X :\ !\tau, \Delta \vdash_T M : \sigma$. Then $\Theta, \Phi, \Delta \vdash_T M\{N/X\} : \sigma$, for any $\Theta, \Phi \vdash_T 
N :\ !\tau$ such that N is a !-box or it belongs to !-Term-Variables. 

Proof. By induction on M. 

Theorem 1 (Subject Reduction). If $\Gamma \vdash_T M : \sigma$, and $M \leadsto N$, then $\Gamma \vdash_T 
N : \sigma$. 

Proof. By induction on M, using the Substitution Lemma above. 

4 Correctness 

A possible statement of correctness for $\Lambda^{T}_{LA}$ states: any $M \in \Lambda^{T}_{LA}$ represents an 
algorithm in P-TIME. 

The proof of such a statement is very simple. First, we consider a language 
$G_{LA}$ of graphs. A strategy that reduces any graph G in a time at most polynomial 
in the dimension of G exists. Second, $G_{LA}$ is proved to be a model of $\Lambda^{T}_{LA}$. So, 
$\Lambda^{T}_{LA}$ becomes a functional notation for dealing with P-TIME. 

The language $G_{LA}$ is explicitly introduced in Appendix A. It is the graph 
version of the sequent calculus of Intuitionistic Light Affine Logic ILAL in PJ. 

Let us use $G_{LA}$ as a model for $\Lambda^{T}_{LA}$. First, let > be the reduction relation on $G_{LA}$, 
which we recall in Appendix A. The intuition about > says that, up to some 
irrelevant details, it mimics the cut-elimination steps of ILAL on $G_{LA}$. Then, 
take $\approx$ as the equational theory of > on $G_{LA}$, namely, its reflexive, transitive, 
and symmetric closure. 

Theorem 2 (Correctness). There exists an embedding $(.)^*$ of $\Lambda^{T}_{LA}$ into $G_{LA}$ 
such that, if $M \leadsto N$, then $M^* \approx N^*$. 

$(.)^*$ is the obvious adaptation to $\Lambda^{T}_{LA}$ of the embedding of the natural deduction 
of Intuitionistic Logic into its sequent calculus, where $(\text{let } X = N \text{ in } M)^* \stackrel{def}{=} 
((\lambda X.M)N)^*$. Then, the correctness of $\Lambda^{T}_{LA}$ is implied by the confluence of >. 

Theorem 3 (Confluence). The rewriting system $\leadsto$ is confluent. 

This theorem is implied by the strong normalizability and the local confluence 
of $\Lambda^{T}_{LA}$. In particular, $\Lambda^{T}_{LA}$ is strongly normalizing because, following Girard in 
0, we can embed it into System F. The local confluence comes from verifying 
that $\leadsto$ has no critical pairs. 



5 Expressiveness 

Any polynomial $f(x_1, \ldots, x_n)$ of arity $n \geq 0$, either linear or not, can be rep- 
resented as a term $\langle f(x_1, \ldots, x_n) \rangle$ of $\Lambda^{T}_{LA}$. In particular, the let constructor is 
required when $f(x_1, \ldots, x_n)$ is not linear. 

Following also a referee’s suggestion, we do not introduce the whole encoding 
of the polynomials. Only the Church Numerals and some operation on them are 
given, together with some further intuition about how the reduction complexity 
is controlled. Those interested to more details about the encoding of polynomials 
are referred to mu, where also the predecessor is introduced. 



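The Church Numerals. Let $int_\tau$ be an abbreviation for $!(\tau \multimap \tau) \multimap §(\tau \multimap \tau)$. 
For any $n \geq 1$, define: 

$$\overline{0} \stackrel{def}{=} \lambda X.§\lambda y.y : int_\tau$$

$$\overline{n} \stackrel{def}{=} \lambda X.§(\lambda y.y_1(\ldots(y_n\, y)\ldots))[X/y_1 \cdots X/y_n] : int_\tau$$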



Now, define an erasure function on $\Lambda^{T}_{LA}$ as follows: from a given M, delete 
all the occurrences of ! and §. Then, for any $[\cdots N/\chi \cdots]$ of any box, substitute 
N for $\chi$ in the body, whatever the forms of N and $\chi$ are. Finally, erase all the 
interfaces, and collapse Term-Variables and !-Term-Variables into a single 
set. Of course, the substitutions must avoid variable clash. Applying this erasure 
to any term of $\Lambda_{LA}$ would yield a term of the usual $\lambda$-Calculus. In particular, 
the terms above would be mapped into the usual Church Numerals. Some 
combinators on them: 



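\[ \mathit{succ} \stackrel{\mathrm{def}}{=} \lambda zX.\S(\lambda y.y_1(y_2\,y))[{}^{X}/_{y_1}\;{}^{zX}/_{y_2}] \;:\; \mathrm{int}_\tau \multimap \mathrm{int}_\tau \]

\[ \mathit{sum} \stackrel{\mathrm{def}}{=} \lambda wzX.\S(\lambda y.y_1(y_2\,y))[{}^{wX}/_{y_1}\;{}^{zX}/_{y_2}] \;:\; \mathrm{int}_\tau \multimap \mathrm{int}_\tau \multimap \mathrm{int}_\tau \]

\[ \mathit{iter} \stackrel{\mathrm{def}}{=} \lambda xSb.\S(y\,w)[{}^{xS}/_{y}\;{}^{b}/_{w}] \;:\; \mathrm{int}_\tau \multimap\, !(\tau \multimap \tau) \multimap \S\tau \multimap \S\tau \]

\[ \mathit{mult} \stackrel{\mathrm{def}}{=} \lambda xY.\mathit{iter}\; x\; !(\lambda w.\mathit{sum}\; z\; w)[{}^{Y}/_{z}]\; \S\overline{0} \;:\; \mathrm{int}_{\mathrm{int}_\tau} \multimap\, !\mathrm{int}_\tau \multimap \S\mathrm{int}_\tau \]

The successor succ, the sum, and the multiplication mult do the obvious thing.
The iteration iter takes as arguments a numeral, a step function, and a base
from which to start the iteration.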
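
The combinators can be tried out on their erasures. The following is only a sketch in Python of the untyped terms that the erasure function described earlier would plausibly produce; the names `church`, `sum_`, `iter_` and `to_int` are illustrative and do not come from the paper. Since the erasure forgets !, § and the interfaces, this code does not enforce the typing discipline: an ill-typed term of the calculus may still run here.

```python
# Untyped Church numerals: assumed images of the typed terms under erasure
# (no !, no section-sign boxes, no interfaces). Names are illustrative only.

def church(n):
    """Return the Church numeral for n: apply f to y exactly n times."""
    def numeral(f):
        def apply_n(y):
            for _ in range(n):
                y = f(y)
            return y
        return apply_n
    return numeral

# Erasure of succ: one more application of the step f.
succ = lambda z: lambda f: lambda y: f(z(f)(y))

# Erasure of sum: w applications of f on top of z applications.
sum_ = lambda w: lambda z: lambda f: lambda y: w(f)(z(f)(y))

# Erasure of iter: a numeral iterates the step function from the base.
iter_ = lambda n: lambda step: lambda base: n(step)(base)

# Erasure of mult: add y to the accumulator x times, starting from zero.
mult = lambda x: lambda y: iter_(x)(lambda w: sum_(y)(w))(church(0))

def to_int(numeral):
    """Read a Church numeral back as a Python int."""
    return numeral(lambda k: k + 1)(0)
```

For instance, `to_int(sum_(church(1))(church(1)))` reads back the numeral obtained by adding one and one.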

Observe that iter 2 (!2) (§0) cannot be typed, since 2, like any numeral
above, cannot be used as a step function. This is because the step function
is required to have identical domain and co-domain. This should not be surprising.
In the introduction we pointed out that the application of ^ to ^ is the starting 
point to yield exponentially growing computations. 

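Notice, however, that there are variations of 0 and 1 that we can use as step
functions for iter:

\[ \overline{0}' = \lambda x.\S\lambda y.y \;:\; \S(\alpha \multimap \alpha) \multimap \S(\alpha \multimap \alpha) \]

\[ \overline{0}'' = \lambda X.!\lambda y.y \;:\; !(\alpha \multimap \alpha) \multimap\, !(\alpha \multimap \alpha) \]

\[ \overline{1}' = \lambda x.\S(\lambda y.y_1\,y)[{}^{x}/_{y_1}] \;:\; \S(\alpha \multimap \alpha) \multimap \S(\alpha \multimap \alpha) \]

\[ \overline{1}'' = \lambda X.!(\lambda y.y_1\,y)[{}^{X}/_{y_1}] \;:\; !(\alpha \multimap \alpha) \multimap\, !(\alpha \multimap \alpha) \]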

We conclude with an example about reduction: 
sum 1 1 

'^1,1 AW§(Ay.yi(y2y))[^^/j/i 

'^1^2 XX.%(Xy.(Xz.ziz)((Xw.wiw)y))[^ / ^ / wf\ 

'^1,1 AX.§(Ay.zi(wiy))[^/^, 

def — 

= 2 . 

For example, the notation stands for a sequence of two contextual appli- 
cations of > 2 , belonging to the §§-group. 

6 Poly-step Reduction Strategy 

Λ^T_LA is proved to have a poly-step normalization process. This means that,
given a term M, there is a strategy such that M rewrites to a normal term N
with a number of ⇝-steps that is at most polynomial in the dimension |M| of M.
Notice that, for a term M, being poly-step in the above sense is not exactly
the same as having a P-TIME normalization process in |M|. Counting the time
would mean considering at least the cost of the renaming operations, and of the
substitution of the terms for the variables. But this is not an issue, as
Λ^T_LA is only an abstract language to program with, and not an implementation
language.

The proof that Λ^T_LA is poly-step splits into two parts.

First, we show that the rewriting relation > of the language of graphs G_LA
is computationally adequate with respect to the rewriting relation ⇝ of Λ^T_LA.
This means showing:

Theorem 4 (Adequacy). There exists an embedding (.)^p from Λ_LA to G_LA
such that, if M ⇝ N, then M^p >* N^p.

Second, we define a canonical reduction strategy on Then, the exis- 
tence of a poly-time reduction strategy >* for Gla, allows to prove: 

Theorem 5 (Poly-step Adequacy). For any term M o/ A^ a, if M N, 
then MP >* Np. 

Proving Adequacy. The embedding (.)^p cannot be as simple as (.)*, otherwise
we could only prove Correctness, as we did in Section 4.

Here, we want more: any reduction step of Λ^T_LA, in fact, has to stand for a
reduction sequence of G_LA (Appendix A). The problem in showing this is that
Λ_LA does not have explicit contraction, unlike G_LA. Recall that any contraction
node of G_LA determines whether a graph can be duplicated or not; in Λ_LA, the
same effect is obtained with two distinguished sets of variables. The result is that
the situations where a contraction node of G_LA gets stuck correspond to pairs
of term constructors of Λ^T_LA that cannot annihilate each other. For example,
consider the following one-step reduction of a term to its normal form:

P = {XX.xXXy.{M)[yy^] - {XY.x\{M)[^/^y.{M)[^/^]){yz) = Q . 
Taking {.)p as (.)*, the graph PP would be the one expected: 




On the contrary, Qp would be: 




which is not quite what we want: Q^p would not be the graph that P^p reduces
to. The difference between (.)* and (.)^p must be the ability to detect whether a
configuration in a term of Λ^T_LA must be translated in the obvious way, as for P,
or must be simplified, as for Q. Situations like the one above involve
pairs of boxes as well. For example, let P =def λxyz.xyz. Then, take the terms:

§(pxx)[’(KM)r/.])[«/v]/^] 

§(PXX)[§(!(“)[”/xl)[-‘=/x'-l/^] 

and reduce them: (.)* would not allow us to prove adequacy. Summing up, (.)^p must
be sensitive to the “history” of some pairs of term constructors in a given term.

This need is met by taking a labeling function which maps the term
constructors λ, !-box, §-box, and the application of Λ^T_LA to natural numbers.
Then, for any M, the computation of (M)^p consists of two main steps. The first
yields the same result, say G_M, as (M)*. The second step operates on G_M by
reducing it with >. These reductions eliminate only those pairs of nodes of G_M
which are images of two term constructors with the same label.

Of course something has to correctly set the labeling. This can be done by 
when yielding the right-hand side of: >5, >5^0, >561 and >|^6- 

Proving Λ^T_LA being Poly-step. Let M be given. We say that N is at depth i ≥ 0,
and we write N^i, if it is in the body of i nested boxes of M. The notion of depth
can obviously be used also for the redexes, which are the left-hand side terms of
the rewriting relation > in Section |21 Let us classify the redexes in two sets. The 
β-redexes belong to the β-group or to the let-group. The box-redexes are all
the others. Now, assume M has at most d nested boxes. For any 0 ≤ i ≤ d,
the i-th reduction round reduces all the β-redexes N^i and all the box-redexes
at depth i, in any order. Finally, the poly-step reduction strategy is the sequence
of reduction rounds which starts from the first and stops (at most) at the last.

Λ^T_LA is poly-step because the strategy is the obvious adaptation of
the poly-time strategy >* on G_LA. To get >p as in P, just replace > for ⇝,
and let the β-redexes be the left-hand graphs of >1,3,4,5,6,7,8,9,10 in Appendix A
while the box-redexes are all the other ones.

7 The Type Inference 

This section is essentially technical. It defines the adaptation to Λ_LA of the
Damas-Milner type inference for ML 0. For a less verbose presentation it is
worth introducing some notation.

Never before used type variables are called fresh.

For any sub-set V of Term-Variables ∪ !-Term-Variables, and for any well
formed set of assumptions Γ, Γ^V stands for Γ restricted to x ∈ V, while Γ_V is Γ
restricted to x ∉ V.

Let τ_1, ..., τ_n, τ'_1, ..., τ'_n be any tuple of types. Then, ⊢_U {τ_1 = τ'_1, ..., τ_n =
τ'_n} ⇒ U denotes the obvious algorithm that yields the most general unifier
U of the pairs τ_1 = τ'_1, ..., τ_n = τ'_n, if any. Otherwise, ⊢_U {τ_1 = τ'_1, ..., τ_n = τ'_n}
⇒ failure.
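
As an illustration of what "the obvious algorithm" for the most general unifier might look like, here is a minimal Python sketch of first-order unification with an occurs check. The representation is an assumption made for this example only: a type variable is a string, and a compound type is a tuple whose head names the constructor, e.g. `("lolli", a, b)` for a ⊸ b, `("bang", a)` for !a and `("para", a)` for §a. Returning `None` plays the role of failure.

```python
def walk(t, subst):
    """Follow variable bindings until reaching an unbound variable or a compound type."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def occurs(var, t, subst):
    """Check whether the variable var occurs in t under the current bindings."""
    t = walk(t, subst)
    if isinstance(t, str):
        return t == var
    return any(occurs(var, arg, subst) for arg in t[1:])

def unify(pairs):
    """Return a most general unifier of the pairs as a dict, or None on failure."""
    subst = {}
    work = list(pairs)
    while work:
        left, right = work.pop()
        left, right = walk(left, subst), walk(right, subst)
        if left == right:
            continue
        if isinstance(left, str):
            if occurs(left, right, subst):
                return None  # occurs check: no finite type solves the equation
            subst[left] = right
        elif isinstance(right, str):
            work.append((right, left))
        elif left[0] == right[0] and len(left) == len(right):
            work.extend(zip(left[1:], right[1:]))
        else:
            return None  # clash between different type constructors
    return subst

def resolve(t, subst):
    """Apply the unifier to a type, fully expanding bound variables."""
    t = walk(t, subst)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(resolve(arg, subst) for arg in t[1:])
```

A constraint of the shape τ = !γ, as used in the rules of the algorithm, would be passed as `unify([(tau, ("bang", "gamma"))])`.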

The identity substitution on types is X. 

For any set of assumptions Γ, and any type τ, the notation ∀Γ.τ stands for
the type ∀α_1 ... α_n.τ, where {α_1 ... α_n} is FV(τ) \ FV(Γ).
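
The generalization ∀Γ.τ can be sketched as follows. This is a simplified Python illustration, not the paper's definition: types are represented as strings (variables) or tuples (constructor name followed by arguments), and the assumptions in `env` are taken to be monomorphic, so no bound variables need to be excluded when computing FV(Γ). The function names are hypothetical.

```python
def free_vars(t):
    """Collect the type variables occurring in a type."""
    if isinstance(t, str):
        return {t}
    result = set()
    for arg in t[1:]:
        result |= free_vars(arg)
    return result

def generalize(env, t):
    """Return (quantified variables, t): quantify over FV(t) minus FV(env)."""
    env_vars = set()
    for assumed in env.values():
        env_vars |= free_vars(assumed)
    return (sorted(free_vars(t) - env_vars), t)
```

Only the variables that the assumptions do not mention are quantified, which is what keeps the let rule from generalizing a variable still constrained by the context.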



The Algorithm. For any well formed set of assumptions Γ, any functional term M,
any substitution S, and any type τ, the algorithm for the type inference derives
two kinds of judgments: either ⊢_TI Γ; M ⇒ S; τ or ⊢_TI Γ; M ⇒ failure. The
first corresponds to the success of the algorithm, while the second to its failure.

The following rules in Natural Semantics |S| define ⊢_TI to derive judgments
of the first kind. The rules for the second kind of judgment are omitted, because
they are obvious. The rules of the algorithm are:



n > 0 



(Ax) 



Fti 



7i, . . . ,7„ fresh 
r, X : V«1 . . . q„.t; X => 21 ; St 



Fti F; M => Sm;tm 
7 fresh 

Fu {tm =17} => 1/ 

Fti X : (VSMF.It/7), SmF{x}; N^S'iv; rjv 
SnSm -compatible with- F 
Fti F ; let X = M in N ^ SnSm’, tm 



{-^e) 



hxi F;M ^ Sm\tm 
hxi SmF ; N =7 Sn;tn 
7 fresh 

Fu [Sntm =tn 7}=7t/ 
t/SivSM -compatible with-F 
Fti F; MN ^ USnSm; F7 






7 fresh 

Fti F{j.pa; : 7;M =7 S;r 
S -compatible with-(F{2,}, x : 7) 
S -compatible with-Fl^l 
Fti F ; \x.M =7 S; S7 — ° r 



i^ix) 



7 fresh 

Fti F{x}, X M ^ S\t 
S-compatible with-(F{x},X :l7) 
S-compatible with-Fl^l 
Fti F-XX.M =7 S; (!S7) ^ r 



Fti F ; X =7 Sjv; tm 
7 fresh 

Fu {tjv =17} => U 
not(!-exp([/7)) 

Fti X Uy,U SjvFj^,}; M =7 Sm;tm 
SmFSiv - compatible with-F 

Fti F ; \{M)\^ / =7 SmU Sn', \tm 

Fti 0; at =7 S; r 
S -compatible with- F 

(! 0 ) 



Fti F; N ^ Sn', tn 
7 fresh 

Fu {"Tiv =!! 7 } =7 U 

Fti X :\Uy,U SN^^x}', M =7 Sm; tm 

SmUSn -compatible with- F 

Fti F ; !(M)[‘^/x] =7 SmU Sn', \tm 

Fti 0 ; At ^ S', t 
S -compatible with- F 



Fti F; !M ^ S; !r 



Fti F;§M ^ S; §r

Fti Si-1 • •


■SiF-Mi ^ Si-,n 


(1 < i < m) 


Fti Rj-i ■ 


■ ■ RlSm ■ ■ ■ SiF ; Nj =7 Rj-, pj 


(1 < i < n) 


Fti Tk-i ■ 


■ ■ TlRn ■ ■ ■ RlSm ■ ■ ■ SlF\ Pk =7 Tk\ Pk 


{1 < k < p) 


Fti Ui-i ■ ■ 


■ UlTp ■ ■ ■ TlRn ■ ■ ■ RlSm ■ ■ • SlP ',Ql ^ Ul'jVl 


(1 < ^ < g) 



U 

U 



71 . . . 7m, Q 1 • • • a„, /3i . . . /3p, (5i . . . fresh 
hu Ul<i<m{Uq ■ ■ ■ U\Tp ■ ■ ■ T\Rn ■ ■ ■ RlSm ' ' ' Si+lTi U 

Ui<j<n{Cfi3 • • • U\Tp ■ ■ ■ TiR„ ■ ■ ■ Rj+ipj =\\aj} 

Ui<fc<p{L^g ■ ' ' UiTp • • • — §/^fe} 

Ul<!<g{t/g • • • Ul + \Vl = §!(5;} 
not(!-exp([/ 7 i)) 
not(!-exp([//3fe)) 

R\ — {xi .7i,-- - - 7^, ^1 . !rri , . . . , Xn , 

yi : Pi,. . . ,Vp : Pp,Yi , Y, :!<5p } 

R2 = Uq ■ ■ ■ UlTp ■ ■ ■ TlRn ■ ■ ■ RlSm ' ' ' Sl-T{ 2 ,j^ 

Lti Uri,UR2-,M ^ Sm;tm 



U 



(1 < i < n) 

(1 < fc < p) 






(§)- 



SUUa 



UiTp ■ ■ ■ TiRn ■ ■ ■ RlSm ■ ■ ■ Si -compatible with- F 



l“Ti F ; 






XI 






^7 
^7.1 
'^7n 



. Mn 

. 

.Qc 



7.™ 

/x„ 

/ Vp 

lYq 



] =7 SuUUq ■ ■ ■ UlTp ■ ■ ■ TlRn ■ ■ ■ RlSm ■ ■ ■ Si', ^TM 



The last rule is a “nightmare” because of the complexity of the type assignment
rule for the §-box. For example, the line:

⊢_TI S_{i-1} ··· S_1 Γ; M_i ⇒ S_i; τ_i    (1 ≤ i ≤ m)

stands for the statements:

⊢_TI Γ; M_1 ⇒ S_1; τ_1
⊢_TI S_1 Γ; M_2 ⇒ S_2; τ_2
...
⊢_TI S_{m-1} ··· S_1 Γ; M_m ⇒ S_m; τ_m

Theorem 6 (Correctness). Let Γ be well formed. If ⊢_TI Γ; M ⇒ S; τ, then
SΓ ⊢_T M : τ. In particular, S -compatible with- Γ.

Proof. Induction on the calls to Fti, using Lemma 



Theorem 7 (Completeness). Let Γ and SΓ be well formed. If SΓ ⊢_T M : σ,
then ⊢_TI Γ; M ⇒ S_M; τ_M, and there is S' such that S = S'S_M and
S'(∀S_MΓ.τ_M) ≥ σ.

Proof. Induction on the length of the derivation of SF I~t M : r, using Lemma^ 
and|^ 

Completeness states that any type assigned by ⊢_T to M can be obtained as
an instance of the type τ that ⊢_TI infers for M. So, τ is principal for M.



8 Conclusions

This work has presented an untyped functional language Λ_LA which has a sub-set
Λ^T_LA that can be typed automatically by a type inference algorithm. The types
for the terms of Λ^T_LA are polymorphic formulas of Intuitionistic Light Affine
Logic.

The main properties of Λ^T_LA are related to the functions it can represent,
and to the complexity of its rewriting system. Every term of Λ^T_LA represents a
P-TIME algorithm. Moreover, there is a poly-step reduction strategy: it gets to
the normal form of any term M in, at most, a polynomial number of steps, with
respect to the dimension of M. Finally, Λ^T_LA is Church-Rosser. So, we can think
of it as a programming language to deal with algorithms with a computational
complexity which is predictable and, at least in principle, reasonably low.

However, Λ^T_LA still needs improvements. Its syntax is still quite heavy. The
main goal, as a referee also suggested, is to eliminate the interfaces from boxes.

The completeness of Λ^T_LA is still open.

There are other languages to program algorithms in P-TIME pj E). The 
weakness of Λ^T_LA, with respect to them, is that completeness is still open. Its
strength comes from its clean logical base.



A The Language of Graphs



This section introduces G_LA. The language G_LA is the graph language Asperti
refers to in P to prove that the sequent calculus for ILAL captures P-TIME.



The language G_LA is the least set of graphs containing the wires, which
correspond to the axioms, and such that, if G and G' are in G_LA, then the
following graphs belong to G_LA as well:





[Figures: graph constructions for cutting two derivations, contraction,
weakening, and the !-box.]




The elements of G_LA have a single upward link: the root. They also have a
possibly empty set of downward links: the inputs. Multiple inputs are denoted
by thick lines.

The rewriting system > is the least relation on G_LA × G_LA containing the
contextual closure of a relation > between parts of graphs. The relation > is:

[Figures: the rewriting rules of >, starting with >1 on app nodes.]

The rule >2 is defined with the following proviso: if 0 = §, then 0^ ∈ {!, §}.
Otherwise, if 0 = !, then 0^ can only be !. Of course, the bounds for n, p, and q
are consistent with these two cases. Moreover, every 0i, 0', 0^ ranges over the
set {!, §}. Rule >s applies to both cases 0 = ! and 0 = §. The rules >9 and >10
are a sort of garbage-collection.

We insist on recalling that both G_LA and its rewriting relation > are the language
effectively used by Asperti to capture P-TIMEp.

The reflexive and transitive closure >* of the rewriting system above
is locally confluent and strongly normalizing. Hence, it is also Church-Rosser.
The strategy able to reduce any graph G in poly-time, with respect to its
dimension, is recalled in Section 6.



Continuation Passing Style Computation

Abstract. We show that one can consider proofs of Gentzen’s LK
as continuation passing style (CPS) programs, and the cut-elimination
procedure for LK as computation. To be more precise, we observe that the
Strongly Normalizable (SN) and Church-Rosser (CR) cut-elimination
procedures for (the intuitionistic decoration of) LKT and LKQ, as presented
in Danos et al. (1993), precisely correspond to call-by-name (CBN) and
call-by-value (CBV) CPS calculi, respectively. This can also be seen as
an extension to classical logic of the Zucker-Pottinger-Mints investigation of
the relations between cut-elimination and normalization.



1 Introduction 

Continuation Passing Style (CPS): Since Griffin’s influential work m on
the Curry-Howard correspondence between classical proofs and CPS programs,
there has been a lot of interest in programming with classical proofs. This is
because these classical calculi relate to important programming concepts such as
non-local exit or exception handling. In Griffin’s result, Plotkin’s call-by-name
(CBN) CPS translation on the simply-typed λ-calculus induces a Gödel
double-negation translation on types.

Proof theory: There is a long line of proof theoretical approaches to under- 
standing “deconstructive” classical logic. That is, classical logic that has Strongly 
Normalizing (SN) and confluent (Church- Rosser or CR) cut-elimination proce- 
dure. This thread began with Girard’s linear logic(LL)[Sl, followed by LC Pj 
and the logic of unity (LU) [in|. It reaches to LKT and LKQ 0 and more 
general ^ through Danos, Joinet and Schellinx (DJS). These works are 

all based on logic in Gentzen-style sequent calculus [Zj. 

We unify the proof theoretical approach (i.e. SN and CR cut-elimination
procedures) and the reduction system approach (CPS). In other words, we find a
new, strict Curry-Howard isomorphism between Gentzen-style classical logic
and programs. As a slogan, it can be stated as “classical proofs as programs
and cut elimination as computation”. In particular, Classical Natural Deduction
(CND)|2D| style programs and their computation are interpreted by LKT and
LKQ proofs and cut-elimination. We also show that Plotkin’s CPS translation
can, in fact, be understood as the translation from ND terms to CND terms.



Jieh Hsiang, Atsushi Ohori (Eds.): ASIAN’98, LNCS 1538, pp. 61-[2| 1998. 
(c) Springer-Verlag Berlin Heidelberg 1998

LKT and LKQ are variations of Gentzen’s original system LK. Confluency
is recovered by adding some restrictions on logical rules to LK, while soundness
and completeness w.r.t. classical provability are retained. DJS proved the
confluency of the cut-elimination procedure for LKT and LKQ by using a
beautiful relation with the multiplicative exponential part of classical linear logic
(MELL), called inductive linear decoration and taking skeleton. Decoration
is a kind of sound and faithful embedding between two logical systems.
The key is that computational properties are preserved between the original
system and the decorated system. From this property, one can see that confluency
of the cut-elimination procedure for LKT and LKQ is an immediate corollary of
confluency for MELLp.

In this paper, we focus on the intuitionistic version of decoration, which is
the analogue of the linear one. Our method can be seen as “yet another proof” of
the SN and CR properties of cut-elimination for LKT and LKQ, since typed
λ-calculus is also proven to be SN and CR CD. Our contribution is to show how
(the intuitionistic decoration of) LKT and LKQ, hence LK, relate to typed λ-term
assignments, and to observe that these assignments coincide with CPS programs
when CND is interpreted in LK. To our knowledge, this
is the first paper that shows the direct Curry-Howard correspondence between
Gentzen style classical logic and CPS programs. Historically, CPS programs have
been studied in natural deduction style, i.e. with terms in the form of abstraction
(→-introduction) and application (→-elimination).

Related Works: At least the CBN part of this framework should be
considered folklore. Girard already suggested the relation between λ-calculus
and ILU in nm. DJS themselves also conjectured, in the final remark of |3| and
in the introduction of ^, a relation between LKT, the negative fragment
of LC, and Parigot’s CND. De Groote revealed the relation between CBN CPS
and Parigot’s λμ-calculus, which is a computational interpretation of CND jS|.
Moreover, there is a detailed work of HerbelinjEl on a term calculus for LJT,
which is an intuitionistic fragment of LKT. He interprets LJT proofs as programs
(λ-calculus) and cut-elimination steps as reduction steps, as we do. We show
that the λ-calculus is completely included as the intuitionistic case of our CBN
CPS calculus.

We also have to mention the pioneering work of Murthy^5|. He shows that
one can interpret the proofs of Girard’s LC (whose negative fragment is LKT and
whose positive fragment is LKQ) by CPS programs through a method called
“intuitionistic extract”. This is quite similar to our intuitionistic decoration
method. However, he could not answer the question of the “appropriateness of
this term extraction method for LC”, because he did not consider the relation
between computation and cut-elimination. Incidentally, Murthy’s paper includes
good references for classical control calculi, which we omit in this paper.

The idea of assigning typed λ-terms to a Gentzen style logical system is not
new in itself. There are early proof theoretical works on LJ by Prawitz [24],
ZuckerpBj, Pottinger [23] and Mintsp^. Their method is essentially CBN
because of the intrinsic cut orientation of LJ. However, our term assignment is
far more precise, because we consider the more general classical case. We simulate
each cut-elimination step for classical LK proofs through normalization. As a
result, the SN and CR cut-elimination procedure on LK is simulated by
normalization. This detailed analysis results in two different simulations (CBN
and CBV) if we restrict LK to LJ.

Throughout this paper, we only handle propositional logic. We also put the
emphasis on the CBV case, as the CBN case is the obvious analogue of the CBV
case. Some tables for CBN systems are given in the appendix. We only employ
implication (→) as logical connective, since this is sufficient to explain our
subject. However, it can be extended to second order (system F polymorphism 0).



A ; A 



/ . s r ^ A- A 



ro=>Ao;A ri,A^ Ai- n ro^Ao,A-n ri,A^Ai; 

- Co, A ^ ^ 0,^1 ; n 






r ^ a: n , r, A, A ^ A' n ^ , r ^ A \ n , r ^ A, A, A \ n , 

(LW) ’ ’ ^ ’ (LC) (RW) (RC) 



r,A^A \ n 



r,A^A \ n 



To ^ Ao ; A A, R ^ Ai ; 



(L- 



(R- 



A.A, A ^ R => Ao,Ai ; ^ r^A\A^B 

Table 1. Original Derivation Rules for LKQ 



2 Decoration 
2.1 Notations 

In this paper, we use the indexed formula version of each logical system
throughout. We use six logical systems: LK and LJ by Gentzen |Zj, and LKT and
LKQ by DJS |3j. We choose our notation to follow that of DJS 0. CND is an
extension to classical logic of natural deduction (in sequent style
representation) by Parigot PSEDIEJ. It has several conclusions on
left-hand-side of sequents. We also add ND as an intuitionistic case of CND.

Formulas are those of propositional logic constructed from →. We use the same
implication symbol across logical systems. An indexed formula is an ordered
pair of a formula and an index. We assume there are denumerably many λ-indices
(resp. μ-indices) ranged over by x, y, z, ... (resp. α, β, γ, ...). We write an
indexed formula (A, x) as A^x and (A, α) as A^α.

Sequents of each logical system are of the following forms:

LJ     Γ ⇒ A
LK     Γ ⇒ Δ
LKT    Π^h ; Γ ⇒ Δ
LKQ    Γ ⇒ Δ ; Π
ND     Γ ⇒ A
CND    Γ ⇒ Δ


where ⇒ is the entailment sign of the calculus. We use rhs and lhs for the
right-hand-side and left-hand-side of the entailment sign, respectively. Γ is a
λ-context, which is a set of λ-indexed formulas. Δ is a μ-context, which is also a
set of μ-indexed formulas. Comma means taking the union as sets. Thus the set
Γ_0 ∪ Γ_1 is denoted by “Γ_0, Γ_1”, and {A^x} ∪ Γ by “A^x, Γ”.

The rhs of the sequent of LJ and ND is an unindexed formula which may
be a fixed arbitrary atomic formula φ. We define intuitionistic negation ¬A as
A → φ.

In LKT, Π^h denotes at most one λ-indexed formula. In LKQ, Π denotes
at most one unindexed formula. The place on the left of the semi-colon where Π^h
is located in LKT is called the stoup, according to Girard|D]. We also call this
specially placed λ-index the head-index and always denote it by h. We also call
the place on the right of the semi-colon the stoup in LKQ.

In our version of CND, there is no unindexed formula in lhs. This is in contrast
to Parigot’s original system, where the rhs of CND has exactly one unindexed
formula (called current formula).

If (fi maps indexed formulas to indexed formulas, then if T = Ai^^ , . . . , An ^'^ , 
we write ipF for the set ip{Ai^^), . . . , ip{An^"). For example, for , 

. . . , where “t” maps an LKT formula to an LJ formula. 

The following conventions are used in distinguishing between occurrences of
indexed formulas in a given logical rule, e.g. (L→) of LK:

\[ \frac{\Gamma_0 \Rightarrow \Delta_0, A^{\alpha} \qquad \Gamma_1, B^{y} \Rightarrow \Delta_1}{(A \to B)^{x}, \Gamma_0, \Gamma_1 \Rightarrow \Delta_0, \Delta_1} \]

The indexed formula (A → B)^x is called the main formula of the rule with
main connective →; the occurrences of A^α and B^y in the premises will be
referred to as the active formulas. All other occurrences are said to be passive
and will be referred to as the context. In the special case of cut, active formula
occurrences are also termed cut-formulas.

We only handle multiplicative rules in every logical system. That is, λ-contexts
(and μ-contexts) in the conclusion are the union of λ-contexts (and μ-contexts)
in the premises.

Hereafter, in order to improve readability, we only write active and main
formulas, and omit contexts as follows:

\[ \frac{\Rightarrow A^{\alpha} \qquad B^{y} \Rightarrow}{(A \to B)^{x} \Rightarrow} \]


In our logical systems, structural rules are implicit. As we interpret contexts
as sets, occurrences of a formula with the same index are automatically
contracted. One can read this as saying that binary rules are always followed by
appropriate explicit contractions, that is, renamings of indices to the same name.
We also take axiom rules to contain appropriate weakenings as contexts.
Notice that in LKT and LKQ, the application of structural rules is restricted
to Γ and Δ. Specifically, Π^h and Π (i.e. formulas in the stoup) are not
subject to weakening.

An initial index is an index which appears for the first time in the whole proof.
We assume all initial indices are distinct unless they are truly related (i.e.
subject to further implicit contraction). This is possible by introducing
“concatenation” of indices on every binary rule. See Zucker|2El for details of
this method.

We quote the original derivation rules for LKQ from P| in Table 1. Notice
that there is a difference in the definition of structural rules and contexts, as we
mentioned above. In the original definition, contexts are interpreted as multisets,
and structural rules are explicit.

We here explain the abbreviations which appear in the names of derivation rules.
The letters L/R stand for Left and Right introduction, D for Dereliction,
C for Contraction and W for Weakening. Notice that we use different
names for the cut rules from DJS. We use h-cut instead of “head”, m-cut instead
of “mid” and t-cut instead of “tail”. This is needed to avoid confusion, because
these words are too common to serve as technical terms.



[Diagram: LK is mapped to LKT and LKQ by the constrictive morphism; LKT and
LKQ (CBN/CBV CPS) are related to MELL and to LJ by decoration in one
direction and by skeleton in the other; cut elimination is preserved along
these maps.]

Table 2. Reduction Preserving Properties

2.2 Decorating LKT and LKQ

In this subsection, we briefly introduce LKT and LKQ and the (linear and
intuitionistic) decoration method j30|, on which our method is based. We shall
assume the basics of the multiplicative exponential part of classical linear logic
(MELL), and the cut elimination procedure (linear procedure) for MELL; see e.g.
[25j for an introduction.

LKT and LKQ are embeddable into MELL by means of the linear decoration
of DJS P]. Under linear decoration, cut elimination for LKT/LKQ
becomes an immediate corollary of cut-elimination for (second order) MELL,
as reductions of the linear or intuitionistic decoration of a derivation π become
reductions of the LKT/LKQ derivation. “(Taking) skeleton” is the reverse
operation of decoration. It means “simply forget about exponentials (!, ?)” in a
derivation.

They are also embeddable into LJ by means of intuitionistic negation
decoration (or simply intuitionistic decoration), which is the analogue of linear
decoration.

Definition 1 (DJS). Decoration on formulas is defined as follows:

A^t, A^q := A (for A atomic)

LKT/LKQ to LJ

(A → B)^t := (¬¬A^t) → (¬¬B^t)

(A → B)^q := A^q → (¬¬B^q)

LKT/LKQ to MELL

(A → B)^t := !?A^t ⊸ ?B^t

(A → B)^q := !(A^q ⊸ ?!B^q)

Proposition 1 (DJS). Decoration on sequents: the following embeddings are both
sound and faithful.

MELL ⊢ Π^t, !?Γ^t ⇒ ?Δ^t   iff   LKT ⊢ Π ; Γ ⇒ Δ   iff   LJ ⊢ Π^t, ¬¬Γ^t, ¬Δ^t ⇒ φ

MELL ⊢ !Γ^q ⇒ ?!Δ^q, Π^q   iff   LKQ ⊢ Γ ⇒ Δ ; Π   iff   LJ ⊢ Γ^q, ¬Δ^q ⇒ Π^q


Remark 1. DJS describes the intuitionistic q-translation on formulas as (A →
B)^q := (¬B^q) → (¬A^q) in g. However, it is not so different from our
definition. It only affects the order of λ-abstraction and application in the term
assignment for L→ and R→ (see next section). We follow the traditional CBV
translation.

Proposition 2 (DJS). Reduction preserving properties: each cut-elimination step
for LKT and LKQ corresponds one-to-one to a cut-elimination step for its linear
decoration.

Proposition 3 (DJS). The cut-elimination procedure for LKT/LKQ is Strongly
Normalizing (SN) and Church-Rosser (CR).

This is because the cut-elimination procedure of MELL is proven to be SN and
CR |2|. All the properties above for linear decoration also hold for intuitionistic
decoration.

Proposition 4 (DJS). LKT and LKQ are sound and complete w.r.t. classical
provability.

Soundness is easy, as ignoring the semicolon in an LKT or LKQ derivation yields
an LK derivation. For completeness, DJS show a concrete method (the constrictive
morphism) through which one can convert any LK derivation into an LKT and an
LKQ derivation.

We display the above facts as a diagram in Table 2.

3 Continuation Passing Style (CPS) Calculus

3.1 Typed λ-Calculus

Raw λ-term. Raw λ-terms are defined as follows:

s, t ::= x          λ-variable
       | λx^A.t     λ-abstraction
       | s t        application.

Application associates to the left, i.e. we write “stu” instead of “(st)u”. We
identify the set of λ-variables with the set of λ-indices of formulas. We write
FV(t) for the set of free variables of the term t, in the usual sense. In other
words, in a term assignment judgment, the type of each λ-variable that occurs
free in a λ-term is identified with the formula that is λ-indexed with the same
name. We let V range over values, which are λ-abstractions or λ-variables.



Substitution. t[x^A := s] means the standard substitution as a meta-operation.
It is the result of substituting s for the free occurrences of x (of the same type
as s) in t, and is defined as follows:

x[x^A := s] = s

y[x^A := s] = y, if x ≠ y

(λy^B.t_1)[x^A := s] = λy^B.(t_1[x^A := s])

(t_1 t_2)[x^A := s] = (t_1[x^A := s])(t_2[x^A := s]).


In the third clause there is no need to say “provided that y ≠ x and y is not free
in s”, by our assumption on initial indices.

β-Contraction. ▷ denotes one step β-contraction and ▷* denotes its
reflexive transitive closure, as usual:

(β) (λx^A.t)s ▷ t[x^A := s]

We often omit the type of the λ-variable in a λ-abstraction when it is
clear from the context. A λ-term of the form (λx.t)s is called a β-redex. t is
normal iff no subterm of t is a β-redex. The result of β-contraction on a β-redex
is called a β-contractum.
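
The substitution clauses and the (β) rule can be sketched in Python as follows. This is an illustrative, untyped rendering with a term representation chosen only for this example: `("var", x)`, `("lam", x, body)` and `("app", fun, arg)`. Like the text, the substitution performs no renaming and relies on the assumption that all bound names are distinct; the leftmost reduction order and the `fuel` bound are choices of this sketch, not of the paper.

```python
def subst(term, name, value):
    """Substitute value for the free occurrences of name, clause by clause."""
    kind = term[0]
    if kind == "var":
        return value if term[1] == name else term
    if kind == "lam":
        # No renaming: relies on the assumption that bound names are distinct.
        return ("lam", term[1], subst(term[2], name, value))
    return ("app", subst(term[1], name, value), subst(term[2], name, value))

def step(term):
    """Perform one leftmost beta-contraction; return None if the term is normal."""
    kind = term[0]
    if kind == "app":
        fun, arg = term[1], term[2]
        if fun[0] == "lam":
            return subst(fun[2], fun[1], arg)  # the beta rule itself
        reduced = step(fun)
        if reduced is not None:
            return ("app", reduced, arg)
        reduced = step(arg)
        if reduced is not None:
            return ("app", fun, reduced)
        return None
    if kind == "lam":
        reduced = step(term[2])
        return None if reduced is None else ("lam", term[1], reduced)
    return None

def normalize(term, fuel=1000):
    """Contract redexes until the term is normal, within a step budget."""
    for _ in range(fuel):
        nxt = step(term)
        if nxt is None:
            return term
        term = nxt
    raise RuntimeError("no normal form within the step budget")
```

A term is normal exactly when `step` returns `None`, which mirrors the definition that no subterm is a β-redex.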



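(Ax)     x : A^x ⇒ A

         v : ⇒ A
(D)      ----------------
         k v : (¬A)^k ⇒ φ

         v : ⇒ A     t : A^x ⇒ C
(t-cut)  -----------------------
         (λx.t)v : ⇒ C

         s : (¬A)^k ⇒ C     t : A^x ⇒ φ
(m-cut)  ------------------------------
         (λk.s)(λx.t) : ⇒ C

         v : ⇒ A     t : B^y ⇒ φ
(L→)     ----------------------------
         m v (λy.t) : (A → ¬¬B)^m ⇒ φ

         t : A^x, (¬B)^l ⇒ φ
(R→)     ----------------------
         λx.λl.t : ⇒ A → ¬¬B

Table 3. CBV CPS term assignment to intuitionistic decoration of LKQ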



3.2 Definitions 

A term assignment judgment (or simply judgment) is an ordered pair of a λ-term
and an LJ sequent. We write a judgment (s, Γ ⇒ C) as s : Γ ⇒ C. Derivation
rules define the term assignment judgments. A derivation is a tree of derivation
rules whose leaves are axioms and whose inner nodes are derivation rules other
than axioms.

We use π for a derivation. The end judgment s : Γ ⇒ C is the judgment
which appears at the root of the derivation tree π. We call the sequent Γ ⇒ C
the proved sequent of the derivation π. We call the term s the assigned term
of the derivation π, and refer to it as TermOf(π). We say the
term TermOf(π) is typable by the proved sequent. Two derivations are equal if
they differ only by the indices of formulas in the proved sequent. We say that a
derivation is cut-free if it contains neither m-cut nor t-cut.

Let 7T be the derivation of which last inference rule is m-cut. We call the 
TermOf(7r), which is a /3-redex, as m-cut redex. t-cut redex is defined in the 
same way. 
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3.3 Term Assignment to LKQ 

We categorize $\lambda$-variables into two groups: object variables and continuation
variables. We use $x, y, z, \ldots$ for object variables and identify them with $\lambda$-indices. We use
$m, n, l, \ldots$ (instead of $\alpha, \beta, \gamma, \ldots$) for continuation variables, following the
standard notation of CPS, and identify them with $\mu$-indices.

To determine the term assignment, we interpret Gentzen-style LJ derivations
by translating them into ND derivations. As an example, the (L $\to$)
rule is interpreted in ND as follows:

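$$\frac{\dfrac{m : (A \to \neg\neg B)^{m} \Rightarrow A \to \neg\neg B \qquad v : \;\Rightarrow A}{mv : (A \to \neg\neg B)^{m} \Rightarrow \neg\neg B}\ (\mathrm{E})
\qquad
\dfrac{t : B^{y} \Rightarrow \phi}{\lambda y.t : \;\Rightarrow \neg B}\ (\mathrm{I})}
{mv(\lambda y.t) : (A \to \neg\neg B)^{m} \Rightarrow \phi}\ (\mathrm{E})$$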

For the names of natural deduction style rules, we use (E) for elimination and
(I) for introduction.

Our term assignment to LJ, which is the intuitionistic decoration of LKQ,
is displayed in Table 3. We call this term assignment system the call-by-value
continuation-passing style (CBV CPS) calculus.

3.4 Cut-Elimination and Normalization 

In this subsection, we prove that each cut-elimination step for LKQ corresponds
one-to-one to a normalization step, and thus can be simulated by $\beta$-contraction on typed
$\lambda$-terms. We prove this by showing that both m-cut and t-cut satisfy this property.

Our term assignment faithfully reflects the structure of the sequent calculus.
Thus the inductive definition of substitution on terms agrees with induction on the
length of derivations in the sequent calculus. We can state this formally as follows:

Proposition 5. (subterm property) In every derivation rule, all terms of the
premises are included as subterms in the term of the conclusion.

Proof. By mechanical checking of the term assignment for each derivation rule.

Proposition 6. The derivation $\pi$ is cut-free iff TermOf($\pi$) is normal.

Proof. ($\Rightarrow$) By induction on the length of the derivation. E.g. for (L $\to$), by the induction
hypothesis $v$ and $t$ are normal, hence $mv(\lambda y.t)$ is normal. ($\Leftarrow$) Obvious, as the terms
of the conclusions of t-cut and m-cut are themselves $\beta$-redexes.

To relate the permutation of a premise of the cut (during a cut-elimination step)
to the conversion of the term, it is convenient to introduce the notion of explicit
substitution.

Definition 2. (explicit substitution) An explicit substitution is of the form
$t\langle x^{A} := s\rangle$. In this framework, $\beta$-contraction is divided into two phases. One is
"syntactic" $\beta$-contraction, and the other consists of explicit rules for substitution, i.e.
we consider each clause of the inductive definition of substitution as a computational step,
instead of a meta-operation. The syntactic $\beta$-contraction is defined as follows:

$(\lambda k^{\neg A}.s)(\lambda x.t) \rhd s\langle k := \lambda x.t\rangle$.

Thanks to the subterm property, each one-step permutation of a premise
corresponds to a few steps of explicit substitution. We now turn to the technical details.
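
The two-phase reading of $\beta$-contraction can be sketched as follows. This is an untyped illustration under assumptions: the tuple encoding, the `"sub"` node and all function names are hypothetical, and bound names are assumed distinct so that no capture occurs. The syntactic phase only records the substitution; each call of `push` performs one explicit substitution step, duplicating or erasing the substituted term according to the occurrences of the variable.

```python
# Terms (hypothetical encoding):
#   ("var", x) | ("lam", x, body) | ("app", f, a) | ("sub", body, x, s)
# ("sub", body, x, s) stands for the explicit substitution body<x := s>.

def beta_syntactic(t):
    """(lam x. body) arg  |>  body<x := arg>; the substitution is not performed."""
    assert t[0] == "app" and t[1][0] == "lam"
    _, (_, x, body), arg = t
    return ("sub", body, x, arg)

def push(t):
    """One explicit substitution step on body<x := s>."""
    assert t[0] == "sub"
    _, body, x, s = t
    kind = body[0]
    if kind == "var":
        return s if body[1] == x else body          # replace, or erase s
    if kind == "app":                               # s is duplicated
        return ("app", ("sub", body[1], x, s), ("sub", body[2], x, s))
    if kind == "lam":
        if body[1] == x:                            # x is shadowed
            return body
        return ("lam", body[1], ("sub", body[2], x, s))
    raise ValueError("reduce the inner substitution first")

def eliminate(t):
    """Push every explicit substitution down until none is left."""
    kind = t[0]
    if kind == "var":
        return t
    if kind == "lam":
        return ("lam", t[1], eliminate(t[2]))
    if kind == "app":
        return ("app", eliminate(t[1]), eliminate(t[2]))
    return eliminate(push(("sub", eliminate(t[1]), t[2], t[3])))
```

Applied to a redex of the shape $(\lambda k.kv)(\lambda x.t)$, the sketch first produces $kv\langle k := \lambda x.t\rangle$ and then, after the substitution steps, $(\lambda x.t)v$.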

Proposition 7. (m-cut elimination) The derivation $\pi$ converts to $\pi'$ by one
m-cut elimination step iff TermOf($\pi$) $\rhd$ TermOf($\pi'$) by contracting the m-cut redex
associated with the m-cut.

Proof. The cut-elimination step for LKT and LKQ is the obvious analogue of the
linear procedure [?]. Thus the right premise of the m-cut is going to be
duplicated/erased according to the contraction/weakening on $(\neg A)^{k}$.

The m-cut changes into (zero, one, or many) t-cut rule(s). We assume
that this m-cut redex is of the form $(\lambda k.s)(\lambda x.t)$. Using explicit substitution, this
syntactically contracts to $s\langle k := \lambda x.t\rangle$.

case 1. implicit structural rules We take (t-cut) as an example of the binary
rules. We display how the m-cut permutes its right premise (i.e. $t : A^{x} \Rightarrow \phi$) over
(t-cut) with implicit contraction in Table 9 in the appendix. By the definition
of substitution, implicit contraction duplicates $\lambda x.t$, as expected. In the
case of implicit weakening, it erases $\lambda x.t$.
case 2. dereliction If the rule is (D), the assigned (sub)term is of the form
$kv$. This term is contracted to $(\lambda x.t)v$, which represents another t-cut. See
Table 10.



Proposition 8. (t-cut elimination) The derivation $\pi$ converts to $\pi'$ by one t-cut
elimination step iff TermOf($\pi$) $\rhd$ TermOf($\pi'$) by contracting the t-cut redex.

Proof. Elimination of a t-cut means that the left premise of the t-cut is going to
be duplicated/erased according to the contraction/weakening on $A^{x}$. The t-cut is then
eliminated at the point where the object variable is introduced, i.e. at an axiom or an (L $\to$)
rule. We assume that this t-cut is of the form $(\lambda x.t)v$. This contracts to $t\langle x := v\rangle$.
The cut-formula is $A^{x}$.

case 1. implicit structural rules This is almost the same as the previous
discussion on m-cut: $v$ is duplicated/erased according to the number of
occurrences of $x$.

case 2. producer of initial index In this case, the $\beta$-contraction on the t-cut
redex corresponds to the elimination of the t-cut. The explicit substitution steps
correspond to the permutation of the left premise of the t-cut.
case 2.1 (Ax) In this case, $\lambda x.t = \lambda x.x$. The t-cut contractum is $v$. This
contraction corresponds to the elimination of the t-cut together with the axiom
which appears as the right premise of the t-cut.

case 2.2 (L $\to$) In this case, $\lambda x.t = \lambda m.mv'(\lambda y.t')$. Thus the t-cut contractum
is $vv'(\lambda y.t')$. We now go into sub-cases according to the derivation
rule that introduced the value $v$.

case 2.2.1 (Ax) In this case, $v = m'$ and the t-cut contractum is
$m'v'(\lambda y.t')$. This only amounts to renaming the variable $m$ to $m'$. It
corresponds to the elimination of the axiom which occurs as the left premise of
the t-cut. Recall that, by definition, derivations that differ only in the
indices of formulas are equal.

case 2.2.2 (R $\to$) In this case, $v = \lambda x.\lambda l.t$ and the t-cut contractum
is $(\lambda x.\lambda l.t)v'(\lambda y.t')$. If this newly created $\beta$-redex represents
the combination of a t-cut and an m-cut redex, then the proof is done.
See Table 11.

It remains to check whether $(\lambda x.\lambda l.t)v'(\lambda y.t')$ is equal to $(\lambda x.(\lambda l.t)(\lambda y.t'))v'$
or not. By our assumption on initial indices, neither $x$ nor $l$ appears
in $v'$ or $t'$. Thus we can always exchange the order of the two explicit
substitutions $\langle x := v'\rangle$ and $\langle l := \lambda y.t'\rangle$. This means that these substitutions are
essentially parallel. In fact, this is exactly how the restrictions of LKT and LKQ
avoid the so-called q/t dilemma and recover confluence [?].



Remark 2. We choose the q-decoration $A^{q} \to \neg\neg B^{q}$ instead of DJS's $\neg B^{q} \to \neg A^{q}$.
This only affects the order of abstraction (such as $\lambda l.\lambda x.t$) and
application (such as $m(\lambda y.t)v$).



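$$\frac{}{kx : A^{x}, (\neg A)^{k} \Rightarrow \phi}\ (\mathrm{Ax}_{v})
\qquad
\frac{t : A^{x}, (\neg B)^{l} \Rightarrow \phi}{m(\lambda x.\lambda l.t) : (\neg(A \to \neg\neg B))^{m} \Rightarrow \phi}\ (\mathrm{R}\to_{v})$$

$$\frac{s : (\neg A)^{k} \Rightarrow \phi \qquad t : A^{x} \Rightarrow \phi}{(\lambda k.s)(\lambda x.t) : \;\Rightarrow \phi}\ (\mathrm{cut}_{v})
\qquad
\frac{t : B^{y} \Rightarrow \phi \qquad s : (\neg A)^{k'} \Rightarrow \phi}{(\lambda k'.s)(\lambda n.mn(\lambda y.t)) : (A \to \neg\neg B)^{m} \Rightarrow \phi}\ (\mathrm{L}\to_{v})$$

Table 4. CBV CPS term assignment to intuitionistic decoration of LK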



3.5 Term Assignment to LK 

Lemma 1. (completeness) LKQ is complete w.r.t. classical provability.

Proof. We show that one can convert any (intuitionistic decoration of an) LK derivation
into an (intuitionistic decoration of an) LKQ derivation by induction on the length
of the derivation. The only interesting case is (L $\to$). We have "$t : B^{y} \Rightarrow \phi$" and
"$s : (\neg A)^{k'} \Rightarrow \phi$" as induction hypotheses. We calculate as follows:

$$\frac{s : (\neg A)^{k'} \Rightarrow \phi
\qquad
\dfrac{\dfrac{}{n : A^{n} \Rightarrow A}\ (\mathrm{Ax}) \qquad t : B^{y} \Rightarrow \phi}{mn(\lambda y.t) : (A \to \neg\neg B)^{m}, A^{n} \Rightarrow \phi}\ (\mathrm{L}\to)}
{(\lambda k'.s)(\lambda n.mn(\lambda y.t)) : (A \to \neg\neg B)^{m} \Rightarrow \phi}\ (\text{m-cut})$$

The constructive proof above automatically generates the CBV term assignment
to (the intuitionistic decoration of) LK.

Theorem 1. The typed $\lambda$-term assignment shown in Table 4 defines an SN and
CR cut-elimination procedure for LK.

Proof. By all of the propositions in this section and the fact that typed $\lambda$-calculus
is proven to be SN and CR [?].

A cut-free LK derivation does not always yield a cut-free LKQ derivation, since
(L $\to$) in an LK derivation converts into an LKQ derivation including one "additional"
m-cut. This m-cut is called a correction cut. Eliminating this m-cut (i.e.
contracting the m-cut redex) is exactly the constrictive morphism [?].
We also display the CBN term assignment in Table 7 (for LKT) and Table 8 (for
LK) in the appendix.



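$$\frac{}{xk : (\neg\neg A)^{x}, (\neg A)^{k} \Rightarrow \phi}\ (\mathrm{Ax})
\qquad
\frac{t : \neg\neg\Gamma, (\neg\neg A)^{x}, (\neg B)^{k'}, \neg\Delta \Rightarrow \phi}{k(\lambda x.\lambda k'.t) : \neg\neg\Gamma, (\neg(\neg\neg A \to \neg\neg B))^{k}, \neg\Delta \Rightarrow \phi}\ (\to \mathrm{I})$$

$$\frac{u : \neg\neg\Gamma_{0}, (\neg(\neg\neg A \to \neg\neg B))^{k''}, \neg\Delta_{0} \Rightarrow \phi
\qquad
s : \neg\neg\Gamma_{1}, (\neg A)^{k'}, \neg\Delta_{1} \Rightarrow \phi}
{(\lambda k''.u)(\lambda m.m(\lambda k'.s)k) : \neg\neg\Gamma_{0}, \neg\neg\Gamma_{1}, (\neg B)^{k}, \neg\Delta_{0}, \neg\Delta_{1} \Rightarrow \phi}\ (\to \mathrm{E})$$

Table 5. CBN CPS Term Assignment to intuitionistic decoration of CND

$$\frac{}{kx : A^{x}, (\neg A)^{k} \Rightarrow \phi}\ (\mathrm{Ax})
\qquad
\frac{t : \Gamma, A^{x}, (\neg B)^{k'}, \neg\Delta \Rightarrow \phi}{k(\lambda x.\lambda k'.t) : \Gamma, (\neg(A \to \neg\neg B))^{k}, \neg\Delta \Rightarrow \phi}\ (\to \mathrm{I})$$

$$\frac{u : \Gamma_{0}, (\neg(A \to \neg\neg B))^{k''}, \neg\Delta_{0} \Rightarrow \phi
\qquad
s : \Gamma_{1}, (\neg A)^{k'}, \neg\Delta_{1} \Rightarrow \phi}
{(\lambda k''.u)(\lambda m.(\lambda k'.s)(\lambda n.mnk)) : \Gamma_{0}, \Gamma_{1}, (\neg B)^{k}, \neg\Delta_{0}, \neg\Delta_{1} \Rightarrow \phi}\ (\to \mathrm{E})$$

Table 6. CBV CPS Term Assignment to intuitionistic decoration of CND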

3.6 Term Assignment to CND

Our term assignment can be extended to (the intuitionistic decoration of) CND.
Namely, CND derivations are interpreted by LK derivations, and normalization
is interpreted by cut elimination. This interpretation enables us to compare
our CPS calculi with standard CPS calculi whose logical base is natural
deduction (in the next section). The only interesting case in the interpretation is
the application rule (i.e. ($\to$ E)), since abstraction (i.e. ($\to$ I)) is directly read off
as (R $\to$). ($\to$ E) is interpreted by the combination of (L $\to$) and the cut rule.



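$$\frac{u : (\neg D)^{k''} \Rightarrow \phi
\qquad
\dfrac{s : (\neg A)^{k'} \Rightarrow \phi \qquad \dfrac{}{kh : B^{h}, (\neg B)^{k} \Rightarrow \phi}\ (\mathrm{Ax})}{m(\lambda k'.s)(\lambda h.kh) : (\neg\neg A \to \neg\neg B)^{m}, (\neg B)^{k} \Rightarrow \phi}\ (\mathrm{L}\to)}
{(\lambda k''.u)(\lambda m.m(\lambda k'.s)(\lambda h.kh)) : (\neg B)^{k} \Rightarrow \phi}\ (\text{h-cut})$$

$$\frac{u : (\neg E)^{k''} \Rightarrow \phi
\qquad
\dfrac{s : (\neg A)^{k'} \Rightarrow \phi \qquad \dfrac{}{ky : B^{y}, (\neg B)^{k} \Rightarrow \phi}\ (\mathrm{Ax}_{v})}{(\lambda k'.s)(\lambda n.mn(\lambda y.ky)) : (A \to \neg\neg B)^{m}, (\neg B)^{k} \Rightarrow \phi}\ (\mathrm{L}\to_{v})}
{(\lambda k''.u)(\lambda m.(\lambda k'.s)(\lambda n.mn(\lambda y.ky))) : (\neg B)^{k} \Rightarrow \phi}\ (\mathrm{cut}_{v})$$

where $D = \neg\neg A \to \neg\neg B$ and $E = A \to \neg\neg B$. Finally we get typed $\lambda$-term
assignments for (the intuitionistic decoration of) CND. We display them in Tables 5
and 6.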



4 Related Works 



4.1 Plotkin’s CPS Translation 

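We quote Plotkin's CPS translation [23]. For CBN:

$\overline{x} = \lambda k.xk$
$\overline{\lambda x.L} = \lambda k.k(\lambda x.\overline{L})$
$\overline{MN} = \lambda k.\overline{M}(\lambda m.m\overline{N}k)$

For CBV:

$\overline{x} = \lambda k.kx$
$\overline{\lambda x.L} = \lambda k.k(\lambda x.\overline{L})$
$\overline{MN} = \lambda k.\overline{M}(\lambda m.\overline{N}(\lambda n.mnk))$.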
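
The two translations can be written down as a short recursive program. This is a sketch under assumptions: the tuple encoding and helper names are hypothetical, source terms are assumed not to use the reserved names `k`, `m`, `n`, and the CBV variable clause follows the standard form of Plotkin's translation, in which the variable is passed to the continuation.

```python
# Source and target terms (hypothetical encoding):
#   ("var", x) | ("lam", x, body) | ("app", f, a)

def var(x): return ("var", x)
def lam(x, b): return ("lam", x, b)
def app(f, a): return ("app", f, a)

def cps_cbn(t):
    """Call-by-name CPS translation."""
    kind = t[0]
    if kind == "var":                       # x  ->  \k. x k
        return lam("k", app(t, var("k")))
    if kind == "lam":                       # \x.L  ->  \k. k (\x. [L])
        return lam("k", app(var("k"), lam(t[1], cps_cbn(t[2]))))
    M, N = cps_cbn(t[1]), cps_cbn(t[2])     # M N  ->  \k. [M] (\m. m [N] k)
    return lam("k", app(M, lam("m", app(app(var("m"), N), var("k")))))

def cps_cbv(t):
    """Call-by-value CPS translation."""
    kind = t[0]
    if kind == "var":                       # x  ->  \k. k x
        return lam("k", app(var("k"), t))
    if kind == "lam":                       # \x.L  ->  \k. k (\x. [L])
        return lam("k", app(var("k"), lam(t[1], cps_cbv(t[2]))))
    M, N = cps_cbv(t[1]), cps_cbv(t[2])     # M N  ->  \k. [M] (\m. [N] (\n. m n k))
    inner = lam("n", app(app(var("m"), var("n")), var("k")))
    return lam("k", app(M, lam("m", app(N, inner))))

def show(t):
    """Fully parenthesised printing, with a backslash for lambda."""
    if t[0] == "var":
        return t[1]
    if t[0] == "lam":
        return "(\\" + t[1] + "." + show(t[2]) + ")"
    return "(" + show(t[1]) + " " + show(t[2]) + ")"
```

The only difference between the two application clauses is where the argument is evaluated: CBN passes the translated argument unevaluated to `m`, whereas CBV first runs it with a continuation that receives its value `n`.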



Proposition 9. Plotkin's CPS translation is exactly the one that translates ND
terms into CND terms.

Proof. In the intuitionistic case of CND, the $\mu$-context $\Delta$ always contains only one
continuation variable. We name it $k$. In order to see the relation, we only
need some renaming: $\lambda k'.t = \overline{L}$, $\lambda k''.u = \overline{M}$, $\lambda k'.s = \overline{N}$. With this renaming,
the translated terms are identical to the terms of our typed $\lambda$-term assignment to CND (see,
again, Tables 5 and 6).

Plotkin shows that the CPS-translated $\lambda$-term simulates the CBN/CBV reduction
strategy of the original $\lambda$-term. This precise relation justifies our claim that CBN/CBV
precisely corresponds to the t/q colouring of the formula. He also pointed out that, among
these reductions, there are two kinds. One is the proper reduction and
the other is the administrative one. We can state this precisely (in the CBV case):
elimination of a t-cut is the proper one and represents intuitionistic computation,
and elimination of an m-cut is the administrative one and represents classical
computation. Among the administrative reductions, some are constrictive
morphisms and others are classical (e.g. continuation) calculations.

4.2 Felleisen’s AC Operator 

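In our CPS calculus, Felleisen's $\mathcal{C}$ operator [?] is a cut-free derivation of Peirce's
law. We mechanically calculate the terms as follows:

CBN: $\mathcal{C} = \lambda y.\lambda k.y(\lambda h.h(\lambda k'.k'(\lambda x.\lambda l.xk))(\lambda h.kh))$
CBV: $\mathcal{C} = \lambda y.\lambda k.y(\lambda x.\lambda l.kx)(\lambda x.kx)$,

where the type of $k$ is $\neg A$. The type of $y$ is $((A \to B) \to A)^{t}$ (CBN) and
$((A \to B) \to A)^{q}$ (CBV), respectively.

Remark 3. In CBN, Felleisen uses the identity function (i.e. the top-level
continuation) of type $A \to A$, instead of our $\lambda h.kh$ of type $\neg A$. That is, he fixes $\phi$ to $A$.
Felleisen also uses [...] instead of $\lambda x.kx$ in CBV.

4.3 Herbelin's $\overline{\lambda}$-Calculus

In Herbelin's work [?], $\overline{\lambda}$-terms are assigned to LJT, which can be regarded as
the intuitionistic version of LKT. Thus our "classical" CBN CPS calculus naturally includes
the "intuitionistic" $\overline{\lambda}$-calculus. We define the inductive translation $(\;)^{*}$ as follows:

(Ax)      $(.\ [\,])^{*} = (k\ h)$
(D)       $(x\ l)^{*} = (x\ \lambda h.(.\ l)^{*})$
(L $\to$)   $(.\ [u :: l])^{*} = ((h\ \lambda k.u^{*})\ \lambda h.(.\ l)^{*})$
(R $\to$)   $(\lambda x.u)^{*} = (k\ \lambda x.\lambda k.u^{*})$
(h-cut)   $(.\ l@l')^{*} = ((\lambda k.(.\ l)^{*})\ \lambda h.(.\ l')^{*})$
(h-cut)   $(u\ l)^{*} = ((\lambda k.u^{*})\ \lambda h.(.\ l)^{*})$
(m-cut)   $(.\ l[x := u])^{*} = ((\lambda x.(.\ l)^{*})\ u^{*})$
(m-cut)   $(v[x := u])^{*} = ((\lambda x.v^{*})\ u^{*})$,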

where $k$ is the only continuation variable in the sequent. The fifth clause (the first h-cut)
shows that substituting another argument list for the continuation variable $k$
amounts to concatenation of argument lists.

5 Conclusions and Further Directions 

We revealed that the CBN/CBV reduction scheme for "classical proofs as
programs" precisely corresponds to the Gentzen-style cut-elimination procedure for
LKT and LKQ. Our approach, in which cut elimination is simulated by normalization,
can be seen as a new approach to Gentzen-style type theory. It merges the
Gentzen style and the natural deduction style, which means that explicit substitutions
are naturally included in our calculus as computational steps corresponding to
the permutation of cuts.

Besides CPS, in the line of the "classical proofs as programs" approach, Parigot's
$\lambda\mu$-calculus [?] is also considered a standard. De Groote revealed
the relation between CBN CPS and the $\lambda\mu$-calculus [?]. Moreover, Ong and Stewart
(OS) introduce a call-by-value (CBV) version of the $\lambda\mu$-calculus in [?], in which
they introduce a new reduction rule ($\zeta_{arg}$). Parigot proves the computational
properties (such as SN and CR) of the CBN version of the $\lambda\mu$-calculus individually.
OS do the same in the CBV case. This seems to be sufficient circumstantial
evidence that the reduction systems of the CPS calculus and the $\lambda\mu$-calculus are isomorphic
in both the CBN and the CBV case. The $\lambda\mu$-calculus can be presented as a term calculus
directly on LKT and LKQ. Isomorphisms w.r.t. reduction systems are a direct
consequence of the decoration method and its reduction-preserving properties. We
are now working on a draft [?] that shows the precise relation between them.

Appendix



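$$\frac{t : A^{h} \Rightarrow \phi}{x(\lambda h.t) : (\neg\neg A)^{x} \Rightarrow \phi}\ (\mathrm{D})
\qquad
\frac{s : (\neg A)^{k} \Rightarrow \phi \qquad t : A^{h} \Rightarrow \phi}{(\lambda k.s)(\lambda h.t) : \;\Rightarrow \phi}\ (\text{h-cut})$$

$$\frac{s : (\neg A)^{k} \Rightarrow \phi \qquad t : (\neg\neg A)^{x} \Rightarrow \phi}{(\lambda x.t)(\lambda k.s) : \;\Rightarrow \phi}\ (\text{m-cut})
\qquad
\frac{t : B^{h} \Rightarrow \phi \qquad s : (\neg A)^{k} \Rightarrow \phi}{m(\lambda k.s)(\lambda h.t) : (\neg\neg A \to \neg\neg B)^{m} \Rightarrow \phi}\ (\mathrm{L}\to)$$

$$\frac{t : (\neg\neg A)^{x}, (\neg B)^{l} \Rightarrow \phi}{k(\lambda x.\lambda l.t) : (\neg(\neg\neg A \to \neg\neg B))^{k} \Rightarrow \phi}\ (\mathrm{R}\to)$$

Table 7. CBN CPS term assignment to intuitionistic decoration of LKT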



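$$\frac{}{x(\lambda h.kh) : (\neg\neg A)^{x}, (\neg A)^{k} \Rightarrow \phi}\ (\mathrm{Ax}_{n})
\qquad
\frac{s : (\neg A)^{k} \Rightarrow \phi \qquad t : (\neg\neg A)^{x} \Rightarrow \phi}{(\lambda x.t)(\lambda k.s) : \;\Rightarrow \phi}\ (\mathrm{cut}_{n})$$

$$\frac{t : (\neg\neg A)^{x}, (\neg B)^{l} \Rightarrow \phi}{m(\lambda x.\lambda l.t) : (\neg(\neg\neg A \to \neg\neg B))^{m} \Rightarrow \phi}\ (\mathrm{R}\to_{n})$$

$$\frac{t : (\neg\neg B)^{y} \Rightarrow \phi \qquad s : (\neg A)^{k'} \Rightarrow \phi}{(\lambda y.t)(\lambda l.x(\lambda m.m(\lambda k'.s)l)) : (\neg\neg(\neg\neg A \to \neg\neg B))^{x} \Rightarrow \phi}\ (\mathrm{L}\to_{n})$$

Table 8. CBN CPS term assignment to intuitionistic decoration of LK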








Ichiro Ogata 



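$$\frac{\dfrac{v : (\neg A)^{k} \Rightarrow B \qquad s : B^{y}, (\neg A)^{k} \Rightarrow C}{(\lambda y.s)v : (\neg A)^{k} \Rightarrow C}\ (\text{t-cut})
\qquad
t : A^{x} \Rightarrow \phi}
{((\lambda y.s)v)\langle k := \lambda x.t\rangle : \;\Rightarrow C}\ (\text{m-cut})$$

converts to:

$$\frac{\dfrac{v : (\neg A)^{k} \Rightarrow B \qquad t : A^{x} \Rightarrow \phi}{v\langle k := \lambda x.t\rangle : \;\Rightarrow B}\ (\text{m-cut})
\qquad
\dfrac{s : B^{y}, (\neg A)^{k} \Rightarrow C \qquad t : A^{x} \Rightarrow \phi}{s\langle k := \lambda x.t\rangle : B^{y} \Rightarrow C}\ (\text{m-cut})}
{(\lambda y.s\langle k := \lambda x.t\rangle)(v\langle k := \lambda x.t\rangle) : \;\Rightarrow C}\ (\text{t-cut})$$

Table 9. Permutation of right premise of m-cut over t-cut with Implicit Contraction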



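$$\frac{kv : (\neg A)^{k} \Rightarrow \phi \qquad t : A^{x} \Rightarrow \phi}{kv\langle k := \lambda x.t\rangle : \;\Rightarrow \phi}\ (\text{m-cut})$$

converts to:

$$(\lambda x.t)v : \;\Rightarrow \phi$$

Table 10. Permutation of right premise of m-cut over (D)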



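$$\frac{\dfrac{t : A^{x}, (\neg B)^{l} \Rightarrow \phi}{\lambda x.\lambda l.t : \;\Rightarrow A \to \neg\neg B}\ (\mathrm{R}\to)
\qquad
\dfrac{v' : \;\Rightarrow A \qquad t' : B^{y} \Rightarrow \phi}{mv'(\lambda y.t') : (A \to \neg\neg B)^{m} \Rightarrow \phi}\ (\mathrm{L}\to)}
{(mv'(\lambda y.t'))\langle m := \lambda x.\lambda l.t\rangle : \;\Rightarrow \phi}\ (\text{t-cut})$$

converts to:

$$\frac{v' : \;\Rightarrow A
\qquad
\dfrac{t : A^{x}, (\neg B)^{l} \Rightarrow \phi \qquad t' : B^{y} \Rightarrow \phi}{(\lambda l.t)(\lambda y.t') : A^{x} \Rightarrow \phi}\ (\text{m-cut})}
{(\lambda x.(\lambda l.t)(\lambda y.t'))v' : \;\Rightarrow \phi}\ (\text{t-cut})$$

Table 11. Permutation of left premise of t-cut over (L $\to$) and (R $\to$)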

Abstract. This tutorial is about the design, and the proof of design, of reliable
systems from unreliable components. It teaches the concept and
techniques of fault-tolerance, at the same time building a formal theory
where this property can be specified and verified. The theory eventually
supports a range of useful design techniques, especially for multiple
faults. We extend CCS, its bisimulation equivalence and modal logic,
under the driving principle that any claim about fault-tolerance should
be invariant under the removal of faults from the assumptions (faults
are unpredictable); this principle rejects the reduction of fault-tolerance
to "correctness under all anticipated faults". The theory is applied to
a range of examples and eventually extended to include considerations
of fault-tolerance and timing, under scheduling on limited resources.
This document describes the motivation and the contents of the tutorial.



1 Motivation 

1.1 Why Fault-Tolerance?

With the growing complexity of computer systems, and despite the progress in the
technology of the basic components (software or hardware), the possibility that
such systems are affected by faults is ever present. Some faults (design faults)
could in theory be discovered off-line, but even when applying formal methods we
cannot responsibly claim, except for the simplest of cases, to have discovered them all.
Other faults (hardware physical faults) are not even amenable to formal analysis,
as they can only manifest themselves on-line. This uncertainty as to the presence
of faults calls for redundancy to become part of a system.

1.2 Why Prove Fault-Tolerance? 

With full verification of realistic systems being practically infeasible, we
emphasise the importance of verifying those parts of a system that are directly
responsible for the management of redundancy: (1) More promise, more risk:
redundancy often leads to intricate management problems and may itself introduce
design faults. (2) Critical parts: after a fault occurs it is crucial that the
mechanisms to prevent a failure of an already impaired system are "correct". (3)
Faults are unpredictable: fault-tolerance implies "correctness under all
anticipated faults", but is the converse true?

Finally, fault-tolerance is not without cost: it asks for more redundancy while
performance asks for less. For a given set of computing resources it is useful to
capture this tradeoff (reliability versus performance) formally and to study, even
optimise, the effect of design decisions without actually building the system.



1.3 Why Provable Fault-Tolerance?

A common approach is to reduce provable fault-tolerance to provable correctness.
When we claim fault-tolerance for a given implementation $\mathit{Impl}$ relative to a fault
assumption $\mathit{Faults}$ and a specification $\mathit{Spec}$, we only proceed to prove
correctness of an implementation $\mathcal{T}(\mathit{Impl}, \mathit{Faults})$ which represents syntactically how
$\mathit{Impl}$ behaves in the presence of $\mathit{Faults}$ [?]; this reduction is most commonly used
without introducing the transformation $\mathcal{T}$ explicitly [?].

Although attractive for many reasons, e.g. the reuse of a variety of tools and
techniques already available for proving correctness, the method also raises some
questions about its feasibility and applicability.

Feasibility. Correctness under all anticipated faults is necessary for provable
fault-tolerance, but is it sufficient? After all, faults are unpredictable, and even
if we assume their presence they may never actually occur. Our claim should
therefore be invariant under some, perhaps even all, of such faults being removed
from the assumptions. Consider a preorder $\le$ on the fault assumptions which
represents the relative severity of faults: $\mathit{Fault}_1 \le \mathit{Fault}_2$ represents that
$\mathit{Fault}_1$ is less severe than $\mathit{Fault}_2$. Say $\mathit{Fault}_2$ represents that a communication
medium may both omit and permute messages and $\mathit{Fault}_1$ represents only
omission. Then we would expect that verifying $\mathit{Impl}$ as tolerant of $\mathit{Fault}_2$ would
immediately imply that it is tolerant of $\mathit{Fault}_1$ alone. We may also invent $\mathit{NoFault}$,
representing the strongest assumption (no faults), and expect that $\mathit{Impl}$, if
tolerant of any $\mathit{Fault}$, would also be tolerant of $\mathit{NoFault}$. Since $\mathit{NoFault} \le \mathit{Fault}$ for any
$\mathit{Fault}$, tolerance with respect to $\mathit{NoFault}$ could reasonably coincide with 'plain'
correctness. Claims about fault-tolerance based on such implicit verification must
be justified with respect to fault-monotonicity [?].
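
The requirement can be made concrete with a toy check. This is only a sketch under assumptions: fault assumptions are modelled as finite sets of fault names ordered by inclusion (a simplification of the preorder above), and `correct_under` is a hypothetical predicate standing for "the fault-affected implementation is correct under exactly these faults".

```python
from itertools import combinations

def weaker_assumptions(faults):
    """All fault assumptions below `faults` in the subset order, including
    the empty set (NoFault) and `faults` itself."""
    items = sorted(faults)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_fault_monotonic(correct_under, faults):
    """True iff correctness under `faults` survives the removal of any
    subset of the assumed faults."""
    return all(correct_under(f) for f in weaker_assumptions(faults))

# Hypothetical protocol: correct with no faults and with both faults present,
# but not when only one of them can occur.
fragile = lambda f: f in (frozenset(), frozenset({"omit", "permute"}))
# Hypothetical protocol that is correct under every weaker assumption.
robust = lambda f: True
```

Here `fragile` passes the two proofs "no faults" and "all faults" yet is not fault-monotonic, which is the situation the reduction to plain correctness fails to detect.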

Applicability. Formal reasoning can be (and should be) considered as a means
to analyse as well as to support the design of 'correct' systems. How can we support the design
of systems tolerant of faults? There are many issues in this case that do not
appear in the broader context: the design of a system which is correct with respect
to its specification, without taking faults into account. A theory for provable
fault-tolerance which is based on reducing fault-tolerance to correctness in the
presence of all faults may not bring effective help to such design issues. Among
the issues is the growing complexity of a system for an increasing number of
tolerated faults, along the chain $\mathit{NoFault} \le \mathit{Fault}_1 \le \ldots \le \mathit{Fault}_n$. We would
like to have design techniques which support this dimension.

2 Contents

The tutorial consists of five parts. They are: fault-tolerance, provable correctness,
provable fault-tolerance, design for provable fault-tolerance and real-time and
provable fault-tolerance.



2.1 Fault-Tolerance

We introduce the concept of fault-tolerance informally, placed in the broader 
context of system dependability and the means to achieve it: fault-avoidance, 
fault-prevention, fault-forecasting and fault-tolerance. We explain in general how 
to build systems with this property, why proving this property is important, and 
the ways we could approach the proof, first by reduction to provable correctness. 



2.2 Provable Correctness 

The basic framework includes process algebra (CCS, PI), its bisimulation
semantics and the associated modal logic [?], all built on the domain of labelled
transition systems [?].

Proving correctness of a fault-affected system is insufficient by itself
to imply that the original system is provably fault-tolerant. We show this by
example: a version of the alternating bit protocol which is correct in the presence
of faults but incorrect in the absence of faults. We also show a protocol which
receives two proofs, in the absence of faults and in the presence of all faults,
but fails the proof in the presence of some of the faults. The reduction does not
work: fault-tolerance needs a proper semantic definition to be provably preserved
under the removal of faults from the assumptions (they are unpredictable).



2.3 Provable Fault-Tolerance

We present ways to model faults semantically, by additional transitions
introduced into the semantics of a process, all labelled by an internal action.
A fault-description language is defined to describe and combine such fault
assumptions (representing the presence of multiple faults). This uses additional
declarations for process constants and induces different fault-affected semantics
of the process language, one for every fault-description term (a corresponding
equivalence is introduced). Fault-tolerant bisimilarity is defined, stronger than
bisimilarity but fault-monotonic. Its properties are investigated, among them
reduction to the original bisimulations with some additional properties, and fault
decomposition: proving fault-tolerance for multiple faults by proving properties
of bisimulations for every fault separately [?]. Finally, a logic is introduced
with asymmetric modal operators, shown to provide a characterisation of
fault-tolerant bisimilarity and able to support reasoning about fault-tolerance
itself: the logic is fault-monotonic (the original modal logic is not). We use many
examples to illustrate the relation and its properties.

2.4 Design for Provable Fault-Tolerance

The relation enjoys many properties that help verification, as above. It also
enjoys properties that help design for provable fault-tolerance. We derive a number
of well-founded techniques, especially to alleviate the complexity of the design for
an increasing number of tolerated faults [?], as follows:

1. Preserving fault-tolerance.

Preservation rules that allow us to modify the design of a system while preserving
(provably) the set of tolerated faults. Given $\mathit{Impl}$ which is tolerant of $\mathit{Fault}$
relative to $\mathit{Spec}$, it should be possible to modify $\mathit{Impl}$ into $\mathit{Impl}'$, following
the rules, and still be able to claim this property. This may help to simplify
the design, for a given level of fault-tolerance, as well as to tolerate more faults.

2. Increasing fault-tolerance.

Incremental refinement towards an increasing number of tolerated faults. We
start with a system which is only correct with respect to its specification,
then introduce faults incrementally: given $\mathit{Fault}_1 \le \mathit{Fault}_2$ and $\mathit{Impl}_1$
tolerant of $\mathit{Fault}_1$ with respect to $\mathit{Spec}$, we refine $\mathit{Impl}_1$ into $\mathit{Impl}_2$, following
the preservation rules, which also becomes tolerant of $\mathit{Fault}_2$.

3. Deducing fault-tolerance.

Separate development: deducing fault-tolerance of an overall system from
fault-tolerance of its components. Suppose $|$ is an operator of the language
and $\mathit{Fault}_1 \vee \mathit{Fault}_2$ is the least upper bound of $\mathit{Fault}_i$ ($i = 1, 2$) with
respect to $\le$. If $\mathit{Impl}_i$ is tolerant of $\mathit{Fault} \vee \mathit{Fault}_i$ with respect to $\mathit{Spec}_i$, and if
neither $\mathit{Impl}_1$ is affected by $\mathit{Fault}_2$ nor $\mathit{Impl}_2$ by $\mathit{Fault}_1$, then $\mathit{Impl}_1 | \mathit{Impl}_2$
is tolerant of $\mathit{Fault} \vee \bigvee_{i=1,2} \mathit{Fault}_i$ with respect to $\mathit{Spec}_1 | \mathit{Spec}_2$.

We apply those techniques to the stepwise design of a distributed database,
supporting atomic transactions despite failures of the underlying hardware. The
development proceeds compositionally, separating the issues of concurrency and
failures (both threaten atomicity) and giving sequential, concurrent and
distributed transactions. The solutions include: stable storage, mutual exclusion,
and two- and three-phase commit with reliable communication [?].

2.5 Real-Time and Provable Fault-Tolerance

The framework is extended to include consideration of fault-tolerance and
timing [?]. The extension includes the proper semantic definition of time [?], with
the algebra, logic and model of faults then readily applied, and a sequence of
increasingly realistic architectures for considering limited resources: processors,
memory, clocks and network. These allow definitions of tasks that are
independent, that communicate using shared memory, or that are partitioned between the
nodes of a distributed system, all connected by a multiple-access network and
each providing local resources. We exemplify techniques for mapping tasks onto
fault-free (static) or fault-affected (dynamic) resources, and show how such techniques
could be verified (schedulability) or synthesised for given timing requirements.

Rewriting techniques are now recognized as a fundamental concept in many
areas of computer science, including mechanized theorem proving and operational
semantics of programming languages.

From a conceptual as well as operational point of view, the notion of rewrite 
rule application is crucial. It leads immediately to the concept of rewriting strat- 
egy which fully defines the way several rules are applied. 

The combined concepts of rewrite rules and strategies are the first-class
objects of the programming language ELAN [BKK+98]. In this language, the
actions to be performed are described using first-order conditional rewrite rules,
and the control is itself specified using strategies that can be non-deterministic.
The use of these strategies is permitted directly in the rules via where statements.
This provides a very natural way to describe e.g. theorem provers, constraint
solvers and knowledge-based reasoning techniques. Moreover, such specifications can
be executed very efficiently via new compilation techniques implemented in the
ELAN compiler [MK98, Vit96]. In the first part of our talk we will present these
concepts and provide running examples of their use.
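
The separation between rules (actions) and strategies (control) can be illustrated outside ELAN. The following Python sketch is not ELAN syntax and does not reproduce ELAN's strategy language; all names are hypothetical. A rule maps a term to the list of its results (the empty list meaning that the rule does not apply), and strategies are combinators over such functions.

```python
# Two toy rewrite rules on natural numbers (hypothetical examples).
def halve(n):
    """n -> n / 2, applicable to positive even numbers only."""
    return [n // 2] if n > 0 and n % 2 == 0 else []

def dec(n):
    """n -> n - 1, applicable to positive numbers only."""
    return [n - 1] if n > 0 else []

def choice(*rules):
    """Non-deterministic strategy: collect the results of every applicable rule."""
    def strategy(t):
        out = []
        for rule in rules:
            out.extend(rule(t))
        return out
    return strategy

def first(*rules):
    """Deterministic strategy: use the first rule that applies."""
    def strategy(t):
        for rule in rules:
            results = rule(t)
            if results:
                return results
        return []
    return strategy

def repeat(strategy):
    """Apply a strategy until it fails; return the reachable normal forms."""
    def normalise(t):
        results = strategy(t)
        if not results:
            return [t]
        out = []
        for r in results:
            for nf in normalise(r):
                if nf not in out:
                    out.append(nf)
        return out
    return normalise
```

With the same two rules, `choice(halve, dec)` and `first(halve, dec)` give different one-step results on 6, which shows how the strategy, rather than the rules, fixes the way rules are applied.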

Making rule application an explicit object is the first step in the elaboration
of the recently introduced rewriting calculus [?]. The $\rho$-calculus, as we
call it, provides abstraction through the rewriting arrow and explicit rule
application. It also embeds the notion of sets of results to deal with non-deterministic
computations. Furthermore, the calculus is parameterized by the matching
algorithm used to fire the rules. In its simplest instance, the $\rho$-calculus embeds
standard first-order rewriting as well as the $\lambda$-calculus. In the second part of the talk,
we will introduce the $\rho$-calculus and show how it provides a simple semantics for
ELAN programs.

Abstract. Tried linear hashing is a combination of linear hashing and
trie hashing. It expands the file gracefully, as linear hashing does, and organizes
each chain of overflow buckets with the use of a trie to ensure that any
bucket can be retrieved in one disk access.



1 Introduction 

When a large volume of data is stored in a disk file, retrieving the relevant
data quickly for a given key requires a hashing method which is able to compute the
disk address of the record from the given key. Many hashing
methods have been proposed in the past two decades.

The easiest way to devise a hash function associated with a file, called the
primary file, is to make use of modulo arithmetic. We regard a primary file as a
table with $T$ entries. The required hash function $h$ is defined as $h(k) = k \bmod T$,
where $k$ is a given key, and $k$ is inserted into bucket (or node) $h(k)$. When a
node is full, subsequent insertions will cause the node to overflow.

The overflow problem can be resolved either by keeping all excess items in
an area for temporary holding, or by storing them in separate chains, one for
each node that overflows. The block in the primary file is called the home block
of the chain, and the overflow blocks are grouped together in a file called the
overflow file, or the secondary file. Both solutions have increasing access cost as
the overflow area or the overflow chains keep growing. The performance will
deteriorate as the volume of data increases. In order to keep the response time
within a certain reasonable limit, the data stored in the primary and secondary
files have to be unloaded and restored into a new primary file of bigger size. This
is a rather expensive housekeeping routine that has to be carried out periodically,
and thus it discourages the use of hashing methods in commercial applications.
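
The growth in access cost can be seen in a small in-memory model of a primary file with separate overflow chains. This is a sketch under assumptions: the class and method names are hypothetical, blocks are Python lists with a fixed capacity, and one "disk access" is counted per block inspected.

```python
class ChainedHashFile:
    """Primary file of T home blocks, each with its own overflow chain."""

    def __init__(self, T, capacity):
        self.T = T
        self.capacity = capacity
        self.primary = [[] for _ in range(T)]    # home blocks
        self.overflow = [[] for _ in range(T)]   # one chain of blocks per home block

    def h(self, k):
        """Hash function h(k) = k mod T."""
        return k % self.T

    def insert(self, k):
        home = self.primary[self.h(k)]
        if len(home) < self.capacity:
            home.append(k)
            return
        chain = self.overflow[self.h(k)]
        if not chain or len(chain[-1]) >= self.capacity:
            chain.append([])                     # append a new overflow block
        chain[-1].append(k)

    def accesses(self, k):
        """Number of blocks read to find k, or None if k is absent."""
        i = self.h(k)
        if k in self.primary[i]:
            return 1
        for reads, block in enumerate(self.overflow[i], start=2):
            if k in block:
                return reads
        return None
```

With $T = 4$ and two keys per block, inserting the keys 0, 4, 8, 12, 16 fills the home block of node 0 and then two overflow blocks, so the last key already costs three accesses.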

To handle file expansion gracefully, more sophisticated methods are
needed. The general two-pronged approach is to increase the file size a little at a
time and to maintain the disk block addresses separately in a directory. When
overflow occurs, a new block is appended and the affected items are
redistributed, whereas unrelated items in other blocks are not moved. The grid file
[?] and extendible hashing [?] are two such schemes. Both schemes require
that directories with entries containing pointers to the relevant disk blocks
be maintained.

When the directory is small enough to reside in memory, both methods
guarantee the retrieval of any item in one disk access. If the directory is large, two
disk accesses are required: one for reading the directory block that contains
the pointer, and another to read in the data block by following the pointer. It is
clear that when the volume of data is large, the directory will be large too.
When data are clustered, this causes excessive node splitting, and the directory
is doubled very often, with many entries being duplicated. This is a very
serious weakness.

Linear hashing and spiral hashing can be used to get rid of the directory
so that the data file will grow gracefully, but these methods have their
weaknesses. Since our interest is in linear hashing and its variants, we will
exclude spiral hashing from our discussion. Linear hashing can be improved in
several ways. For example, it can be modified to expand partially so as to
increase the storage utilization, to make it order preserving by employing an
interpolation-based index, to optimize its performance for key-sequential
access, and to generalize it to multidimensional space. These variants of
linear hashing use separate chaining to resolve the key collision problem. Only
the recursive linear hashing proposed by Ramamohanarao and Sacks-Davis uses a
predetermined number of linear hashing files to organize the overflow blocks.

On the other hand, Ilsoo Ahn proposed the use of a filter buffer to describe
all the overflow blocks. Torn proposed Overflow Indexing (OVI) to describe
every overflowed key. Litwin proposed trie hashing, in which only one trie
is used, and other enhancements, such as controlled load, followed. All
these methods rely on an additional memory structure to ensure that, given a
key, the required record can always be retrieved in one read.

In this paper, we propose tried linear hashing, which is a combination of linear 
hashing and trie hashing. We discuss linear hashing and trie hashing in sections 
2 and 3. In section 4, the tried linear hashing method is described. We present 
some empirical results for comparison in section 5 and conclude our presentation 
in section 6. 

2 Linear Hashing 

Linear hashing is a hashing method applied to data files that grow or shrink
dynamically as data are inserted or deleted, with high space utilization
and no significant deterioration in access time when the files become very big. It
requires a family of hash functions $\{h_d\}$ such that most of the time 2 consecutive
functions $h_d$ and $h_{d+1}$ are used. Each $h_d$ is defined as $h_d(k) = k \bmod (N \cdot 2^d)$,
where N is the number of buckets in the file at the beginning and d ($d \ge 0$) is
the depth of the file.

Initially, d = 0, and $h_0(k) = k \bmod N$ is used. All incoming keys are inserted
into the buckets according to the address computed. When the file is so densely
populated that its storage utilization u exceeds a certain limit L, the load factor,
the file is expanded by appending a block at the end of the file for redistribution
of data. This will lower the storage utilization u, reduce the number of overflow
blocks, and hence maintain the cost of accessing the overflow chains at a reasonably
low level. For the first overflow encountered, all keys in bucket 0 will be rehashed
into either bucket 0 or bucket N according to the hashing function $h_1$ instead of $h_0$.
The split pointer sp is used to point to the block to be split and it is initialized to
0. After the block sp is split, sp will be incremented by 1. Thus all blocks will
take turns to split. When block N - 1 is split, the size of the file will have been
doubled to 2N, sp will be reset to 0 and the depth d will be incremented by 1.

Below is the algorithm used to insert a new key into a primary file of depth 
d using linear hashing. Note that u = n/t where n is the number of items (keys, 
records) inserted so far and t is the total number of slots provided. If the bucket 
capacity is b and the files (primary and secondary) have s buckets, then t = b*s. 

Linear Hashing Insertion (Key k)

    BucketAddress p, p1
    StorageUtilization u

    p <- h_d(k)
    If p < sp
        then p <- h_{d+1}(k)
    Insert k into bucket p, or an overflow bucket when it is full
    Compute u
    If u > L
        then
            p1 <- 2^d * N + sp
            Redistribute all keys in bucket sp and its overflow buckets, if any,
                into buckets sp and p1 according to h_{d+1}
            sp <- sp + 1
            If sp >= 2^d * N
                then sp <- 0; d <- d + 1

As the roving split pointer sp blindly gives each block a chance to split, very
often a block will overflow before it is split. When this occurs, a block in the
overflow file will be allocated to store the overflowed keys and linked to the home 
block. When a chain of overflow buckets becomes very long, the time to retrieve 
an item on it also gets longer. 
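
The insertion procedure above can be illustrated with a minimal in-memory sketch. It is not the authors' implementation: the class name `LinearHashFile` and its methods are hypothetical, each bucket is a Python list whose excess over the capacity b stands in for an overflow chain, and the utilization is computed over primary slots only (the paper counts secondary buckets as well).

```python
class LinearHashFile:
    """Illustrative linear hashing file (assumed names, in-memory only)."""

    def __init__(self, n_initial=5, capacity=30, load_limit=0.7):
        self.N = n_initial            # number of buckets at depth 0
        self.b = capacity             # slots per bucket
        self.L = load_limit           # load factor limit
        self.d = 0                    # depth of the file
        self.sp = 0                   # split pointer
        self.n = 0                    # number of keys stored
        self.buckets = [[] for _ in range(n_initial)]

    def h(self, d, k):
        # h_d(k) = k mod (N * 2^d)
        return k % (self.N * 2 ** d)

    def address(self, k):
        p = self.h(self.d, k)
        if p < self.sp:               # bucket p was already split in this round
            p = self.h(self.d + 1, k)
        return p

    def utilization(self):
        # Simplification: only primary slots are counted in t.
        return self.n / (self.b * len(self.buckets))

    def insert(self, k):
        self.buckets[self.address(k)].append(k)
        self.n += 1
        if self.utilization() > self.L:
            self._split()

    def _split(self):
        old = self.buckets[self.sp]
        self.buckets[self.sp] = []
        self.buckets.append([])       # new bucket at address 2^d * N + sp
        for key in old:               # rehash with h_{d+1}
            self.buckets[self.h(self.d + 1, key)].append(key)
        self.sp += 1
        if self.sp >= self.N * 2 ** self.d:
            self.sp = 0
            self.d += 1

    def contains(self, k):
        return k in self.buckets[self.address(k)]
```

In this sketch the number of primary buckets always equals N * 2^d + sp, which is why the appended bucket lands exactly at the address p1 used in the pseudocode.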

3 Trie Hashing 

Trie hashing is one of the fastest access methods for dynamic and ordered files. 
Its efficiency lies in the use of a trie. It starts out with a bucket in which all keys 
will be stored. When an overflow occurs, another bucket will be appended at
the end of the primary file. All keys will then be redistributed between the
overflowing bucket and the new bucket just allocated by comparing the value of
the first character of each key with a discriminator, which is a suitable value
that will usually divide the keys evenly. A key whose first character is smaller
than or equal to the discriminator will go into the original bucket; otherwise it
will go into the new bucket. No secondary file is needed.

The result of splitting the buckets is described in a trie with the discriminator
and its associated position within the key stored in each internal node, and the
bucket addresses stored in the leaf nodes.

When the keys are numbers, a bit is used for comparison instead of using the 
whole character. As a result, the discriminators are not required to be stored in 
the internal nodes. During the search, each bit of the given key will be examined.
If it is zero, we proceed to the left subtree; otherwise we go to the right subtree.
This is digital searching.

We may describe the bit checking by a family of functions $\{s_d\}$, where
$s_d(k) = \lfloor k/2^d \rfloor \bmod 2$ and d is the depth of the node in which $s_d$
is used. Below is the algorithm used to insert a new key k.

Trie Hashing Insertion (Key k)

    TrieNode p
    integer d

    p <- the root of the trie
    d <- 0
    While (p is an internal node)
        If s_d(k) = 0
            then p <- p.left
            else p <- p.right
        d <- d + 1
    Read in p.bucket
    If (p.bucket is not full) insert k
    else
        Allocate one more bucket
        Perform bucket splitting and update the trie

It is possible that after redistribution, all keys go into the same bucket and 
overflow again. This may result in multiple empty buckets being allocated and 
the depth of the trie will be increased by more than one. If the keys are uniformly 
distributed, these empty buckets will be filled subsequently. 
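
As a hedged illustration of the bit-based variant, the sketch below keeps the trie and the buckets in memory. The names `BitTrieFile`, `Leaf` and `Node` are hypothetical, keys are assumed to be distinct non-negative integers, and bucket splitting is repeated until no bucket exceeds its capacity, which reproduces the empty buckets mentioned above.

```python
class Leaf:
    def __init__(self):
        self.keys = []                # contents of one bucket


class Node:
    def __init__(self, left, right):
        self.left, self.right = left, right


def s(d, k):
    """s_d(k) = floor(k / 2^d) mod 2, i.e. bit d of the key."""
    return (k >> d) & 1


class BitTrieFile:
    """Illustrative trie hashing over integer keys (assumed names)."""

    def __init__(self, capacity=4):
        self.b = capacity
        self.root = Leaf()

    def _descend(self, k):
        parent, p, d = None, self.root, 0
        while isinstance(p, Node):
            parent, p = p, (p.left if s(d, k) == 0 else p.right)
            d += 1
        return parent, p, d

    def contains(self, k):
        return k in self._descend(k)[1].keys

    def insert(self, k):
        parent, leaf, d = self._descend(k)
        leaf.keys.append(k)
        while len(leaf.keys) > self.b:        # split until the bucket fits
            left, right = Leaf(), Leaf()
            for key in leaf.keys:
                (left if s(d, key) == 0 else right).keys.append(key)
            node = Node(left, right)
            if parent is None:
                self.root = node
            elif parent.left is leaf:
                parent.left = node
            else:
                parent.right = node
            parent, d = node, d + 1
            leaf = left if len(left.keys) > self.b else right

    def leaves(self):
        stack = [self.root]
        while stack:
            p = stack.pop()
            if isinstance(p, Node):
                stack += [p.left, p.right]
            else:
                yield p
```

Inserting the keys 0, 8, 16, 24 and 32 with capacity 4 shows the effect: the keys agree on their three lowest bits, so three splits separate nothing and leave three empty buckets before bit 3 finally divides them.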

4 Tried Linear Hashing 

Although linear hashing allows the data file to grow gracefully, the existence of 
some long overflow chains will make the retrieval of items in these chains slow. 
On the other hand, although trie hashing allows us to access the overflow blocks 
through a trie, the number of nodes to visit is large when many keys are stored 
in the file. 

Intuitively, the combination of these two methods should give us a hashing
algorithm in which the primary file can expand gracefully while the tries used
to describe the chains of overflow buckets (one trie for each home bucket with
overflow buckets) are usually very small and require a visit to only one or two
nodes most of the time. Whenever the primary file expands, the overflow blocks
described in the affected trie may be lifted from the overflow file back into the
primary file, and hence the height of the tries will not grow without bound.

What is needed in maintaining many tries is an array trie[ ] of pointers to 
the roots of the tries. This array should be large enough for the primary file 
used. Below is the insertion algorithm of the tried linear hashing method. 

Tried Linear Hashing Insertion (Key k)

    BucketAddress p, p1
    StorageUtilization u
    TriePointer trie[ ]
    integer d

    p <- h_d(k)
    If p < sp
        then p <- h_{d+1}(k)
    If (!trie[p])
        then
            Insert k into bucket p
            Handle bucket overflow and the creation of trie accordingly.
        else
            Follow the trie structure trie[p]
            Insert k into the bucket found
            Handle bucket overflow and the updating of trie accordingly.
    Compute u
    If u > L
        then
            p1 <- 2^d * N + sp
            Redistribute all keys in bucket sp and all of its overflow buckets
                into buckets sp and p1 according to h_{d+1}
            sp <- sp + 1
            If sp >= 2^d * N
                then sp <- 0; d <- d + 1

5 Empirical Results 

Several experiments have been conducted to find out how much saving in disk 
accesses can be achieved when tried linear hashing is used as compared to the 
original linear hashing. In the experiments, the node capacity is set to 30, the 
initial hash file size is 5 buckets and the load factor limit L is set at 0.6, 0.7, 
0.8, and 0.9. We created several files containing different numbers of items and
measured the total number of node accesses during file creation. The statistics
are listed in the following tables.

From the tables, it is noticed that the number of disk accesses has been 
reduced by 5% for L = 0.6 to 62% for L = 0.9. This is quite a substantial 
improvement over the original linear hashing during file creation. When the 
load factor is increased, more overflow blocks are allocated and accessed. Since 

Number of Items   Linear (a)   Tried Linear (b)   b/a (%)
           1000         2237               2176        97
           2000         4556               4362        96
           3000         7071               6629        94
           4000         9264               8787        95
           5000        11579              10922        94
           6000        14095              13272        94
           7000        16256              15423        95
           8000        18454              17549        95
           9000        20739              19644        95
          10000        23181              21848        94

Table 1. Hash file creation cost. Load factor = 0.6



Number of Items   Linear (a)   Tried Linear (b)   b/a (%)
           1000         2516               2185        87
           2000         5188               4393        85
           3000         7748               6649        83
           4000        10516               8837        84
           5000        12880              10899        85
           6000        15574              12948        83
           7000        18753              15560        83
           8000        20920              17636        84
           9000        23226              19696        85
          10000        25674              21751        85

Table 2. Hash file creation cost. Load factor = 0.7



Number of Items   Linear (a)   Tried Linear (b)   b/a (%)
           1000         3142               2049        65
           2000         6741               4098        61
           3000        10078               6154        61
           4000        13861               8179        59
           5000        17021              10247        60
           6000        20593              12299        60
           7000        24494              14326        58
           8000        28385              16370        58
           9000        31551              18429        58
          10000        34865              20487        59

Table 3. Hash file creation cost. Load factor = 0.8

Number of Items   Linear (a)   Tried Linear (b)   b/a (%)
           1000         4099               2047        50
           2000         9746               4096        42
           3000        14964               6152        41
           4000        20292              10245        40
           5000        25759              10245        40
           6000        31365              12297        39
           7000        37231              14324        38
           8000        43067              16368        38
           9000        49331              18427        37
          10000        55109              20485        37

Table 4. Hash file creation cost. Load factor = 0.9



the accesses of the overflow blocks have been diverted to the associated tries 
in memory, the reduction in the number of accesses to the overflow blocks is 
expected to be greater as the load factor increases. This is confirmed by the 
outcomes of the experiments. 

The use of the split functions $\{s_d\}$ effectively turns the files into a structure
created by a general bucket method of degree 2 in which each full bucket is
split into 2 when it overflows. The storage utilization of this kind of structure
is about 0.69. As a result, the load factor of tried linear hashing is about 0.7
even when the load factor limit is set at 0.9.

To have a fairer comparison on the retrieval cost, we use L = 0.7. The saving 
in file creation cost is 15% using our algorithm. As for the retrieval cost, it 
is always one with tried linear hashing. Based on the file with 10000 keys, the 
average retrieval cost for linear hashing is 1.09 and the maximum cost is 2. When 
L is 0.8 (0.9), the corresponding figures are 1.51 (2.62) and 3 (5) respectively. 

Note that when L = 0.7, the costs of retrieving a key from a file created
through linear hashing and through tried linear hashing are comparable when the
keys are uniformly distributed. When the distribution of the keys is skewed or
clustered, there will be many long chains of overflow buckets. The performance of
tried linear hashing remains unchanged whereas that of linear hashing is greatly
affected.



6 Conclusion 

In this paper, we study linear hashing and trie hashing. Although linear hashing 
allows the data file to expand gradually online without the need to do an offline 
file expansion, it fails to organize its overflow buckets for quick retrieval. On the 
other hand, trie hashing imposes a trie structure onto the overflow buckets for 
efficient handling, but it requires many nodes of the trie to be visited. 

We propose to combine the two algorithms into one and call it tried linear 
hashing. The new hashing method has the best of both worlds. It expands the file
gradually as in linear hashing, and it organizes the overflow file through many
small tries, one for each chain of overflow buckets as in trie hashing. In fact, the 
original trie hashing is a special case of our tried linear hashing in which only 
one bucket is used in the primary file and only one trie is used for the overflow 
buckets. 

An experiment was carried out to measure the cost, in terms of the number
of disk accesses, of inserting various numbers of keys using linear hashing and tried
linear hashing. It is noted that when the file is moderately loaded at 70%, the 
cost of file creation is reduced by about 15% and the retrieval cost is guaranteed 
to be 1 whereas linear hashing requires 1.09. The cost saving is greater when the 
load factor is higher. 

In another paper [2], we describe a new hashing method, filtered linear hashing,
in which linear hashing is combined with filtered hashing. Both tried
linear hashing and filtered linear hashing are equally efficient in terms of the file
creation and key retrieval cost. In view of the complexity of merging the empty
buckets when substantial deletions are performed in filtered linear hashing, the
use of tried linear hashing is preferred.

Abstract. The magic-sets method is a basic query optimization method
in deductive database systems. However, the original magic-sets
method may generate large magic predicates for recursive queries. In
this case, the evaluation of the magic predicates dominates the whole
evaluation cost. Factorized magic sets can limit the sizes of generated
magic predicates by splitting some magic predicates. However, this method
suffers from a new "over-splitting" problem. In this paper, we focus on one
problem: what is the best splitting schema for a magic predicate, given a
magic program? We propose a hypergraph model to represent the magic
program as well as its naive evaluation procedure. The intuition is that a magic
predicate whose arguments belong to different connected components in
an infinite number of its generated graphs is considered to be a big one and
thus should be split. Based on the hypergraph model, we propose a new
concept, called c-partition, as the best splitting of a magic predicate.
Although we still do not know how to construct a c-partition, we define
a series of d[k]-partitions to approximate the c-partition. We prove that the
d[k]-partition is better than the existing splitting algorithm. Our method
is a global splitting strategy for magic predicates, in the sense that it
decides whether or not to split a magic predicate by considering the whole
program.

keywords: deductive databases, query optimization, magic sets, factoring 



1 Introduction 

The magic-sets method is an efficient algorithm for optimizing recursive as well as
non-recursive queries in deductive databases and rule-based systems, and it is
implemented as a standard facility in many prototype systems. It
is a technique that allows us to rewrite the rules for each query form so that the
advantages of top-down and bottom-up methods are combined. The method is
effective (optimal) in the sense that it generates no unnecessary facts.

However, the magic-sets method assumes that the size of each magic predicate
is usually smaller than that of the corresponding recursive predicate. Therefore, the
magic-sets method generates as many magic predicates as possible, and maximizes
the arity of each magic predicate. This strategy may result in generating
large magic predicates.

Example 1. Consider the following program P. It is a typical same-generation
program.

    sg(X,Y) :- flat(X,Y).
    sg(X,Y) :- up(X,U) down(Y,V) sg(U,V).

Assume that the given query is sg(1,1), that is, both the first and second
arguments are bound. Then a typical magic-sets transformation of the program,
denoted as P^{mg}, would be

    sg^{bb}(X,Y) :- m_sg^{bb}(X,Y) flat(X,Y).
    sg^{bb}(X,Y) :- m_sg^{bb}(X,Y) up(X,U) down(Y,V) sg^{bb}(U,V).
    m_sg^{bb}(U,V) :- m_sg^{bb}(X,Y) up(X,U) down(Y,V).
    m_sg^{bb}(1,1).

where m_sg^{bb}(U,V) is a magic predicate. Its definition rules (the third and fourth
ones in the program) are called magic rules. The set of magic rules is called the magic
program. □

In P^{mg}, the magic predicate m_sg^{bb} acts as a filter to restrict the computing
space of a bottom-up evaluation of the recursive predicate sg^{bb}. The superscript
bb is called the adornment of the corresponding predicate and represents the
binding information for that predicate; bb means that both arguments of the predicate
sg are bound by the query constants.

Experience with relational databases has shown that set-at-a-time operations are
very important for database applications. Hence, to answer a query p(c, X), where
c is a vector of bound arguments, on a program P, a bottom-up semi-naive
evaluation is usually employed in database applications. However, for a recursive
query, the evaluation may produce many temporary results which make no
contribution to the final query results. In order to reduce the computing space
of the semi-naive evaluation, the original program is modified by the magic-sets
method. The basic idea of the magic-sets method is to convey the binding
information of the query to the rule definitions so as to reduce the set of relevant facts
of the bottom-up semi-naive evaluation. The magic-sets method constructs one
magic predicate, denoted as m_p, for every IDB predicate p in each rule in the
original program. The original rules that define p are modified by inserting m_p in
the body of the rule. That is, the modified rule has the form p :- m_p, body(p),
where body(p) represents the body of the original rule. m_p thus acts as a filter
in evaluating p. Those tuples that are in body(p) but not in m_p are excluded
from the evaluation. Clearly, the smaller the relation m_p, the more efficient the
bottom-up semi-naive evaluation. We denote the magic-set program of P by
P^{mg}.

Ullman and Seki [3] showed that the size of the search space represented by
the stored relation sg^{bb} is equal to the search space of the top-down evaluation.
Hence, the magic-set program is optimal in the sense that it does not generate
unnecessary facts. Unfortunately, the size of m_sg^{bb} is possibly very large. Let us
consider an EDB for Example 1 as follows: $up = \{(1,2), (2,3), \ldots, (n-1,n), (n,1)\}$,
$down = \{(1,2), (2,3), \ldots, (n-2,n-1), (n-1,1)\}$. Then the magic predicate m_sg^{bb}
is the set $\{(i,j) \mid i = 1, \ldots, n;\ j = 1, \ldots, n-1\}$, whose size is $O(n^2)$.
In this case, the magic-sets processing actually computes the Cartesian product
of relations up and down.
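
The quadratic growth can be checked with a small sketch that evaluates the magic rules of Example 1 bottom-up over this EDB. The function names `cycle` and `magic_sg` are hypothetical, and the worklist loop is only one simple way to reach the fixpoint, not the evaluation procedure of any particular system.

```python
def cycle(m):
    """The relation {(1,2), (2,3), ..., (m-1,m), (m,1)}."""
    return {(i, i % m + 1) for i in range(1, m + 1)}


def magic_sg(up, down, seed=(1, 1)):
    """Fixpoint of
         m_sg(U,V) :- m_sg(X,Y) up(X,U) down(Y,V).
         m_sg(1,1).
    """
    succ_up, succ_down = {}, {}
    for x, u in up:
        succ_up.setdefault(x, set()).add(u)
    for y, v in down:
        succ_down.setdefault(y, set()).add(v)

    magic, frontier = {seed}, [seed]
    while frontier:
        x, y = frontier.pop()
        for u in succ_up.get(x, ()):
            for v in succ_down.get(y, ()):
                if (u, v) not in magic:
                    magic.add((u, v))
                    frontier.append((u, v))
    return magic


n = 6
m_sg = magic_sg(cycle(n), cycle(n - 1))
print(len(m_sg), n * (n - 1))   # prints "30 30"
```

Because the two cycles have coprime lengths n and n - 1, every pair (i, j) is eventually generated, so the magic relation holds n(n - 1) tuples although up and down together hold only 2n - 1.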

To overcome this problem, a way of factoring magic predicates, called
factorized magic-sets, has been proposed. Magic predicates are repeatedly split until
every magic rule (a rule whose head is a magic predicate) satisfies the following
conditions:

(1) The non-magic predicates in the body constitute a connected set.

(2) Every variable in the head of a magic rule appears in some non-magic
    predicate, or the body contains only magic predicates.

In other words, a magic predicate m_p(X,Y) should be partitioned if there is a
rule defining m_p(X,Y) whose body contains at least two connected components,
one containing X and another containing Y, or if X belongs to a magic predicate
and does not appear in any non-magic predicate, where X and Y are two vectors
of arguments of p.

Example 2. The magic predicate m_sg^{bb} in Example 1 is defined by a rule
whose body contains two disconnected base predicates, up and down; thus
it should be partitioned into two smaller magic predicates, bf_m_sg^{bb} and
fb_m_sg^{bb}, as follows.

    bf_m_sg^{bb}(U) :- bf_m_sg^{bb}(X) up(X,U).
    fb_m_sg^{bb}(V) :- fb_m_sg^{bb}(Y) down(Y,V).
    bf_m_sg^{bb}(1).
    fb_m_sg^{bb}(1).

Every magic predicate is now unary, so they obviously do not produce big
relations. □
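
For comparison, the factored magic predicates of Example 2 can be evaluated over the cyclic EDB discussed earlier. This is a sketch under the same assumptions as before; `cycle` and `reachable` are hypothetical helper names.

```python
def cycle(m):
    """The relation {(1,2), (2,3), ..., (m-1,m), (m,1)}."""
    return {(i, i % m + 1) for i in range(1, m + 1)}


def reachable(edges, start):
    """Fixpoint of  m(U) :- m(X) edge(X,U).  with seed m(start)."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    seen, frontier = {start}, [start]
    while frontier:
        a = frontier.pop()
        for b in succ.get(a, ()):
            if b not in seen:
                seen.add(b)
                frontier.append(b)
    return seen


n = 6
bf_m_sg = reachable(cycle(n), 1)        # bf_m_sg(U) :- bf_m_sg(X) up(X,U).
fb_m_sg = reachable(cycle(n - 1), 1)    # fb_m_sg(V) :- fb_m_sg(Y) down(Y,V).
print(len(bf_m_sg), len(fb_m_sg))       # prints "6 5"
```

The two unary relations hold n and n - 1 values, so their total size grows linearly in n, whereas the unfactored binary magic predicate holds n(n - 1) pairs.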

Sippu and Soisalon-Soininen [7] proved that the size of a factorized magic
predicate is linear in the size of the largest temporary result that is the join of a
set of connected non-magic predicates in the definition of the magic predicate.
However, as they pointed out themselves, factoring may result in worse
behavior than the original magic predicates. That is, the original magic
predicate may be a stronger filter for the basic (semi-)naive evaluation than the
factorized ones. Intuitively, if m_p(X,Y) is not a big one, we should compute it
explicitly, because the size of m_p(X,Y) is no greater than that of m_p(X) × m_p(Y),
where m_p(X) and m_p(Y) are the projections of m_p(X,Y) on X and Y respectively.
However, if m_p(X,Y) is a big one, constructing m_p(X,Y) explicitly may dominate the
cost of the whole evaluation. Hence the key is to estimate the size of magic predicates.

Sippu and Soisalon-Soininen's strategy decides the size of a magic predicate
according to the properties of each definition rule. We thus call it a local
splitting algorithm. This strategy may split some magic predicates into pieces that
are too small. We call this problem "over-splitting". It means that the factored magic
predicates may weaken the filtering effect in the bottom-up evaluation. Let us see
the following example.

Example 3. Consider the following magic rules

    m_q^{bb}(U,V) :- m_p^{bbf}(X,Y) d(X,Y,U,V).
    m_p^{bbf}(U,V) :- m_q^{bb}(X,Y) a(X,U) b(Y,V).

By the factorized magic-sets method, m_p^{bbf}(X,Y) will be split into two smaller
unary magic predicates:

    m_q^{bb}(U,V) :- fbf_m_p^{bbf}(X) bff_m_p^{bbf}(Y) d(X,Y,U,V).
    fbf_m_p^{bbf}(U) :- m_q^{bb}(X,Y) a(X,U).
    bff_m_p^{bbf}(V) :- m_q^{bb}(X,Y) b(Y,V).

If we transform the above magic program into the following equivalent form,
both magic predicates are defined by rules whose bodies contain only connected
base predicates. By Sippu and Soisalon-Soininen's splitting criterion, neither of
them needs to be split.

    m_q^{bb}(U,V) :- m_q^{bb}(S,T) a(S,X) b(T,Y) d(X,Y,U,V),
    m_p^{bbf}(U,V) :- m_p^{bbf}(S,T) d(S,T,X,Y) a(X,U) b(Y,V),
    m_p^{bbf}(U,V) :- a(1,U) b(1,V),

□

This example gives us a hint that if we consider the global properties of all the rules
that define a magic predicate, it is possible to overcome the "over-splitting"
problem. In this paper, we focus on the following problem: what is the best splitting
schema for magic predicates in a magic program?

The remainder of the paper is organized as follows. Section 2 introduces some basic
concepts as well as Sippu and Soisalon-Soininen's factorized magic-sets method.
Section 3 proposes a hypergraph model to represent the magic program as well
as its naive evaluation procedure. We then propose a new criterion for splitting:
a magic predicate whose arguments belong to different connected components
in an infinite number of its generated graphs should be split. We summarize it as
a concept called c-partition. Section 4 defines a concept, called d-partition, to
approximate the optimal splitting schema. We prove that the d-partition is
better than Sippu and Soisalon-Soininen's factoring. This result enhances the
applicability of the factorized magic-sets method. Section 5 concludes this paper.

2 Recursive Queries, Magic-Sets, and Factorization 

We first briefly introduce the basic concepts of recursive queries, magic-sets
in the Datalog language, and Sippu and Soisalon-Soininen's factorized magic-sets
method. For details of the Datalog language, see Ullman; for details of the
factorized method, see Sippu and Soisalon-Soininen.

Assume that there is an underlying first-order language without function
symbols. A (Datalog) program is a finite set of clauses, called rules, of the
form:

    R: p(X) :- p_1(X_1) ... p_m(X_m).

where $m \ge 0$; p(X), called the head, is an atom of an ordinary predicate, and
p_1(X_1) ... p_m(X_m), called the body, stands for the conjunction
$p_1(X_1) \land \cdots \land p_m(X_m)$; each p_i(X_i) is called a subgoal (predicate) and is
an atom of either an ordinary predicate or a built-in predicate. A rule with an empty
body is a fact, which contains no variables. A query is a rule without a head, with some
of its variables possibly bound to constants. A predicate is called a base predicate
(or EDB predicate) if it is not a head in the program. Otherwise it is called a
derived predicate (or IDB predicate). A derived predicate is called recursive if
it is contained in a cycle in the dependency graph of a program, which has all
predicates of the program as the nodes and has an edge from p to q if p is found
in the body and q is found in the head of the same rule in the program. The IDB
predicates in the same cycle are called mutually recursive predicates. A rule is
recursive if it contains some recursive predicates. Nonrecursive rules are called
exit rules, and the corresponding base predicate is called an exit predicate.

Definition 1. Let C be a set of predicates. Two variables, X and Y, are
connected in C if they appear in the same predicate in C or there is another
variable Z such that X and Z appear in the same predicate in C and Z and Y are
connected in C. Two predicates are connected if they each contain one of a pair
of connected variables.

Let C be the set of all EDB predicates of a rule. Connectivity among both 
variables and predicates in C is an equivalence relation. It splits all predicates 
in C into some connected components. We say a rule is a single-connected- 
component rule if all EDB predicates are connected, otherwise the rule is a 
multiple-connected-component rule . 

Example 4. In Example 1, let C = {up, down} be the set of base predicates of
the second rule. C contains two connected components, so the second rule is a
two-connected-component rule. □
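
Connectivity in the sense of Definition 1 can be computed with a union-find over the subgoals of a rule body. The sketch below is only illustrative: the function name `connected_components` and the representation of a subgoal as a (name, variables) pair are assumptions, not notation from the paper.

```python
def connected_components(subgoals):
    """Group subgoals that share variables, directly or transitively.

    subgoals: list of (predicate_name, tuple_of_variables).
    Returns a list of components, each a list of predicate names.
    """
    parent = list(range(len(subgoals)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    first_seen = {}                         # variable -> index of first subgoal using it
    for i, (_, variables) in enumerate(subgoals):
        for v in variables:
            if v in first_seen:
                parent[find(i)] = find(first_seen[v])
            else:
                first_seen[v] = i

    groups = {}
    for i, (name, _) in enumerate(subgoals):
        groups.setdefault(find(i), []).append(name)
    return list(groups.values())


# EDB subgoals of the second rule of Example 1: two components.
edb = [("up", ("X", "U")), ("down", ("Y", "V"))]
print(connected_components(edb))            # [['up'], ['down']]

# Adding the IDB subgoal sg(U,V) connects everything.
print(connected_components(edb + [("sg", ("U", "V"))]))   # [['up', 'down', 'sg']]
```

The second call shows the requirement stated next in the text: once the IDB predicates are added to C, the rule becomes a single-connected-component rule.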

Although a rule is possibly a multiple-connected-component rule, we require
that the rule become a single-connected-component one if we add the IDB
predicates to the predicate set C. Furthermore, we require that all IDB predicates
be rectified, that is, that the variables of each are distinct. Otherwise, we can replace
repeated variables by different variables and introduce an equality predicate between
them. We also require that the rules be safe, that is, all variables in the head of a rule
should also appear in the body of that rule.

Two important concepts in magic-sets transformation are adornments and
sideways information passing strategies (sips for short). An adornment is
an annotation on a predicate that gives information about how the predicate is
used. The typical adornment system contains only bound (b for short) and free
(f for short) adornments. For example, the adornment of the query in Example 1,
sg^{bb}, means that the first and second arguments of sg are bound. A sips is a
decision on how to pass the bindings available in the body of a rule while
evaluating the rule. However, the choice of the best sips for a magic program is
beyond the scope of this paper. We simply choose a full sips, which propagates all
the binding information from the rule's head to the rule's body and between the
subgoals, and reorder all predicates in the body from left to right so that they obey
the partial order defined by the sips.

The details of the typical magic-sets method are omitted here. Our purpose
is to overcome the problem of the existing factorized magic-sets method that we
mentioned in the previous section. Factoring magic predicates is done over the magic
program generated by the typical magic-sets method. Hence, in this paper, we
assume that a magic program is given, without concern for how it was generated,
and we only introduce Sippu and Soisalon-Soininen's factorized magic-sets
algorithm [7] here.

The kernel of the algorithm is a factoring algorithm. It replaces a magic
predicate whose variables are not connected in the set of EDB predicates of
the rule by its b-factors. First, we define the concept of b-factoring of an
adornment and the product operation on b-factorings.

Definition 2. Let $\alpha$ and $\beta$ be adornments for an n-ary predicate p, which are
considered as bit vectors with $b = 1$ and $f = 0$. The disjunction of $\alpha$ and $\beta$,
denoted as $\alpha \lor \beta$, is the bitwise "or" of $\alpha$ and $\beta$; the conjunction of $\alpha$ and
$\beta$, denoted as $\alpha \land \beta$, is the bitwise "and" of $\alpha$ and $\beta$; and the negation of
$\alpha$, denoted as $\lnot\alpha$, is the bitwise "not" of $\alpha$.

Definition 3. $\alpha \le \beta$ if $\alpha \lor \beta = \beta$; $\alpha < \beta$ if $\alpha \le \beta$ and $\alpha \ne \beta$.

Definition 4. A set $\{\alpha_1, \ldots, \alpha_m\}$ of adornments for a k-ary predicate p is
a b-factoring of an adornment $\alpha$ for p if $\alpha_1 \lor \cdots \lor \alpha_m = \alpha$ and, for all
$i \ne j$, $\alpha_i \land \alpha_j = f^k$, where $f^k$ represents f written k times. Each $\alpha_i$ is
called a b-factor of $\alpha$.

Definition 5. Let $F = \{\alpha_1, \ldots, \alpha_m\}$ and $G = \{\beta_1, \ldots, \beta_n\}$ be b-factorings of
an adornment $\alpha$ of a k-ary predicate. The product of these factorings, denoted
by $F \times G$, is defined as
$\{\alpha_i \land \beta_j \mid i = 1, \ldots, m,\ j = 1, \ldots, n\} \setminus \{f^k\}$.

Obviously, the product of two b-factorings of $\alpha$ is always a b-factoring of $\alpha$.

Example 5. Consider the adornment $\alpha = bb$ of the IDB predicate m_sg in the
second rule of Example 1. $\{bf, fb\}$ is a b-factoring of $\alpha$, because $bf \lor fb = bb$
and $bf \land fb = ff$. □

Definition 6. Let F and G be two b-factorings of $\alpha$. G is a refinement of F,
denoted by $G \le F$, if for every factor $\beta \in F$ there are factors
$\gamma_1, \ldots, \gamma_n \in G$ such that $\beta = \gamma_1 \lor \cdots \lor \gamma_n$, that is,
$\{\gamma_1, \ldots, \gamma_n\}$ is a b-factoring of $\beta$.
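
Definitions 2 to 6 translate directly into operations on adornment strings. The sketch below represents an adornment as a Python string over the letters b and f; all function names are hypothetical and chosen only for this illustration.

```python
def disj(a, b):
    """Bitwise "or" of two adornments of equal length."""
    return "".join("b" if "b" in (x, y) else "f" for x, y in zip(a, b))


def conj(a, b):
    """Bitwise "and" of two adornments of equal length."""
    return "".join("b" if x == y == "b" else "f" for x, y in zip(a, b))


def neg(a):
    """Bitwise "not" of an adornment."""
    return "".join("f" if x == "b" else "b" for x in a)


def leq(a, b):
    """a <= b  iff  a or b = b  (Definition 3)."""
    return disj(a, b) == b


def is_b_factoring(factors, alpha):
    """Definition 4: factors are pairwise disjoint and their disjunction is alpha."""
    factors = list(factors)
    zero = "f" * len(alpha)
    union = zero
    for g in factors:
        union = disj(union, g)
    disjoint = all(conj(x, y) == zero
                   for i, x in enumerate(factors) for y in factors[i + 1:])
    return union == alpha and disjoint


def product(F, G):
    """Definition 5: pairwise conjunctions, without the all-free adornment."""
    k = len(next(iter(F)))
    return {conj(a, b) for a in F for b in G} - {"f" * k}


def refines(G, F):
    """Definition 6: every factor of F is the disjunction of factors of G."""
    return all(is_b_factoring([g for g in G if leq(g, beta)], beta) for beta in F)


print(is_b_factoring({"bf", "fb"}, "bb"))              # True, as in Example 5
print(sorted(product({"bbf", "ffb"}, {"bff", "fbb"})))  # ['bff', 'fbf', 'ffb']
print(refines({"bf", "fb"}, {"bb"}))                   # True
```

The second call illustrates the remark after Definition 5: the product of two b-factorings of bbb is again a b-factoring of bbb, and it refines both of them.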

Next, we introduce the factoring algorithm. It takes as input a magic-sets
transformed program P^{mg} of a safe Datalog program P with rectified rule heads and
rectified IDB subgoals. The algorithm produces, for each adorned IDB predicate
p^α, a b-factoring $B_{p,\alpha}$ of the adornment α.

Algorithm 1 (Factoring)

Input: P^{mg}

Output: $B_{p,\alpha}$, for all derived predicates p in P^{mg}

Method:

(1) $B_{p,\alpha} = \{\alpha\}$ for each adorned IDB predicate p^α.

(2) For each magic rule

        R: m_r^β(V) :- m_p^α(U) p_1(X_1) ... p_k(X_k)

    in P^{mg}, let C be the set of the non-magic subgoals of R, which is partitioned
    into a set of connected components, say $C_1, \ldots, C_m$, $m \ge 1$.

    (2.1) [splitting those arguments whose variables appear in EDB predicates]

        Replace $B_{r,\beta}$ by $B_{r,\beta} \times \{\beta_1, \ldots, \beta_m, \delta\}$, where
        $\beta_i \le \beta$ ($i = 1, \ldots, m$) is the adornment for the predicate r that
        designates as bound exactly those arguments in m_r^β(V) that appear in some
        subgoal in $C_i$, and $\delta = \beta \land \lnot(\beta_1 \lor \cdots \lor \beta_m)$.
        $\{\beta_1, \ldots, \beta_m, \delta\}$ is a b-factoring of β.

    (2.2) [splitting those arguments whose variables appear only in the magic
        predicate in the body]

        If V contains two variables that appear only in U in the body of R and
        that belong to different factors in $B_{p,\alpha}$, then $B_{r,\beta}$ must be refined
        further by replacing it with
        $B_{r,\beta} \times (\{\beta'\} \cup \{\beta_\gamma \mid \gamma \in B_{p,\alpha}\})$, where
        $\beta' \le \beta$ is the adornment for r that designates as bound exactly those
        variables in m_r^β(V) that appear in $X_1, \ldots, X_k$; $\beta_\gamma \le \beta$
        ($\gamma \in B_{p,\alpha}$) is the adornment for r that designates as bound exactly
        those variables V in m_r^β(V) that satisfy the following conditions:

        - V does not appear in $X_1, \ldots, X_k$, and

        - V appears in m_p^α(U) in a position that is also designated as bound
          by γ.

        Clearly, $\{\beta'\} \cup \{\beta_\gamma \mid \gamma \in B_{p,\alpha}\}$ is a b-factoring of β.

    This step should be repeated until no change occurs. □



Example 6. Consider the magic predicate $m\_sg$ in Example [?]. There is only
one rule defining this magic predicate. $C = \{up(X,U), down(Y,V)\}$, which is
partitioned into two connected components. Hence the adornment set $\{bb\}$ of $m\_sg$
is replaced by its b-factoring $\{bf, fb\}$.
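
Step (2.1) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the head of the magic rule is taken to be m_sg(U, V) (the text gives only the non-magic subgoals), adornments are written as strings over b and f, and the helper names `connected_components` and `factor_adornment` are hypothetical. Components that bind no head argument are skipped, which is a simplification.

```python
def connected_components(subgoals):
    """Group subgoals that share variables, directly or transitively.

    subgoals: list of (name, variables). Returns one variable set per component.
    """
    components = []
    for _name, variables in subgoals:
        merged = set(variables)
        kept = []
        for comp in components:
            if comp & merged:
                merged |= comp          # absorb every component touching this subgoal
            else:
                kept.append(comp)
        components = kept + [merged]
    return components


def factor_adornment(head_args, adornment, subgoals):
    """Split a head adornment by connected components (step 2.1, simplified)."""
    bound = {i for i, a in enumerate(adornment) if a == "b"}
    factors, covered = [], set()
    for comp in connected_components(subgoals):
        part = {i for i in bound if head_args[i] in comp}
        if part:
            factors.append("".join("b" if i in part else "f"
                                   for i in range(len(adornment))))
            covered |= part
    rest = bound - covered              # plays the role of delta in the algorithm
    if rest:
        factors.append("".join("b" if i in rest else "f"
                               for i in range(len(adornment))))
    return factors


# Example 6, assuming the rule head is m_sg(U, V).
print(factor_adornment(["U", "V"], "bb",
                       [("up", ["X", "U"]), ("down", ["Y", "V"])]))
```

With the two disconnected subgoals the sketch yields the factors bf and fb; if a single subgoal mentioned both U and V, the adornment bb would stay unsplit.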

Based on this factoring algorithm, the original magic program can be revised
by splitting those magic predicates whose corresponding adornment set contains
two or more elements [7].
Algorithm 2 (Factorized magic program)

Input: $B_{p,\alpha}$, $P^{mg}$

Output: the factorized magic program

Method:

(1) For each factor $\gamma' \in B_{q,\gamma}$, there is a seed

where $Y'$ consists of the bound arguments from $q^\gamma(Y)$.

(2) For each magic rule

$m\_r^\beta(V) \mathrel{:-} m\_p^\alpha(U)\, p_1(X_1) \cdots p_k(X_k)$

in $P^{mg}$ and for each factor $\beta' \in B_{r,\beta}$, there is a factorized magic rule. The head
of the factorized magic rule is $\beta'\_m\_r^\beta(W)$, where $W$ consists of those
arguments from $V$ that are designated as bound by $\beta'$. The body of the factorized
magic rule is constructed as follows. If $W$ is contained in $U$, then the body
consists of the subgoals $\alpha_j\_m\_p^\alpha(U_j)$, where $\alpha_j \in B_{p,\alpha}$ and $U_j \subseteq U$ consists of
the arguments that are designated as bound by $\alpha_j$ ($1 \le j \le m$). If $W$ is contained in
$X_1, \dots, X_k$, then the body contains every subgoal $p_i(X_i)$ where $X_i \cap W \ne \emptyset$, and all
those subgoals that are connected to $p_i(X_i)$ via a path not going through any
subgoal $p_i(X_i)$, $i > 1$.

(3) For each modified rule

in $P^{mg}$ there is a modified rule

$p^\alpha(X) \mathrel{:-} \alpha_1\_m\_p^\alpha(U_1) \cdots \alpha_m\_m\_p^\alpha(U_m)\, p_1(X_1), \dots, p_k(X_k)$

where $B_{p,\alpha} = \{\alpha_1, \dots, \alpha_m\}$ and $U_i$ ($i = 1, \dots, m$) consists of those arguments
of $p^\alpha(X)$ that are designated as bound by $\alpha_i$. □

Example 7. The adornment set of the magic predicate $m\_sg$ in Example [?]
contains two b-factors. Hence $m\_sg$ should be split. The resulting magic program is
the one listed in Example [?]. □

3 A Hypergraph Model for Magic Programs 

The above factoring algorithm decides how to split a magic predicate according
to the connectivity of the set of non-magic predicates in each defining rule.
That is, the factoring is based on a local property of the rule. This is the reason
that the factoring algorithm suffers from the “over-splitting” problem we mentioned
in the Introduction. To tackle the optimal splitting problem, we need a powerful tool
that represents the global properties of the magic program easily. In this section,
we propose a hypergraph model to represent the magic program as well as its
naive evaluation procedure. Corresponding to a magic predicate, there is a set
of generated graphs. We will see that a magic predicate may produce a big relation
if its arguments belong to different connected components in infinitely many generated
graphs. Thus this kind of magic predicate should be split.

According to the magic-sets method, we know that the magic predicates
may be mutually recursive with some IDB predicates. For evaluating a nonlinear
multiple-rule mutually-recursive query, the (semi-)naive algorithm can be viewed
as a loop of two steps that is repeated until no change occurs:

(1) evaluating all magic predicates while fixing all IDB predicates (their initial
values are empty);

(2) evaluating all IDB predicates by using the obtained magic predicates.

In this research, we pay attention to the sizes of magic predicates. Hence, when
evaluating magic predicates, IDB predicates can be treated in the same way as EDB
predicates within one iterative step. That is, we can assume that the size of any IDB
predicate is not big in each iteration. Furthermore, each magic rule contains exactly one
magic predicate in its body. All rules are safe and all subgoals in the body of a
rule, including the magic subgoals, are connected. As a result, the magic program
of a Datalog program is a safe, rectified, multiple-rule, linearly-recursive
program.

Therefore, in the following discussion, we consider a multiple-rule linearly-recursive
program. The magic rules are safe and rectified. Furthermore, for simplicity,
we omit the adornments of the magic predicates as well as the prefix
“m_” of the magic predicates. The other predicates in the body, including IDB
predicates and EDB predicates, are called non-magic predicates.

Definition 7. Let

$R:\ p(U) \mathrel{:-} r(V)\, p_1(X_1) \cdots p_n(X_n)$

be a rule. A labeled hypergraph, denoted by $G(R) = (V, E, L^+, L^-)$, is defined for
$R$ as follows.

(1) $V$ is a finite, nonempty set of nodes. A node is defined by a variable in the
rule. Furthermore, a node is labeled by $X$ itself if $X$ is a variable appearing
only in some non-magic predicate of the rule; a node is labeled by $p_i^+$ if the
corresponding variable appears in the $i$-th position of the magic predicate
$p$ (the head of the rule); a node is labeled by $r_i^-$ if the corresponding variable
appears in the $i$-th position of the magic predicate $r$ (in the body of the rule).
We call a node with mark “+” a positive node, and one with mark “−” a
negative node. For convenience, we call the label $X$ a v-label (variable-label),
and labels like $p_i^+$, $r_i^-$ a-labels (argument-labels).

(2) $E$ is a finite, nonempty set of undirected hyperedges such that a hyperedge
is defined and labeled by $q$ among the nodes corresponding to the variables $X$ if $q(X)$
is a non-magic predicate in the rule.

(3) $L^+$ is the set of all a-labels of positive nodes.

(4) $L^-$ is the set of all a-labels of negative nodes.

If $L^-(G) = \emptyset$, the graph is called an E-graph, otherwise an I-graph. If $L^+(G) = L^-(G)$
except for the “+” and “−” marks, the graph is called a self-recursive graph. A node
not appearing in any hyperedge is called a dangling node.



Example 8. The graph representation of the magic rules in Example [?] is shown
in Fig. 1. The first rule consists of a unique hyperedge, and the second rule
contains two separated components. The third rule can be viewed as the rule
$q(X,Y) \mathrel{:-} e(X,Y)$, where $e(X,Y)$ contains only one tuple $(1,1)$. □




Fig. 1. Hypergraph representation for a magic program. (a) and (b) are I-graphs;
(c) is an E-graph.



Two hyperedges $a$ and $b$ are connected if they contain at least one common node,
or if $a$ is connected with a hyperedge $c$ and $c$ contains at least one common node
with $b$. Two nodes are connected if they appear in a set of connected hyperedges.
A graph $G(R)$ is called connected if all of its positive nodes are connected.
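
The connectivity notion above amounts to computing connected components of a hypergraph. The following Python sketch uses a small union-find structure for this; the function names and the string encoding of node labels (such as "p1+" or "r1-") are assumptions made for illustration, and every node used in a hyperedge is assumed to be listed in `nodes`.

```python
def node_components(nodes, hyperedges):
    """Partition nodes into groups linked by chains of overlapping hyperedges."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for edge in hyperedges:
        edge = list(edge)
        for other in edge[1:]:
            parent[find(edge[0])] = find(other)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


def positives_connected(positive_nodes, nodes, hyperedges):
    """True if all positive nodes fall into a single connected component."""
    wanted = set(positive_nodes)
    return any(wanted <= comp for comp in node_components(nodes, hyperedges))


nodes = ["p1+", "p2+", "r1-", "r2-"]
two_parts = [("p1+", "r1-"), ("p2+", "r2-")]
print(positives_connected(["p1+", "p2+"], nodes, two_parts))
print(positives_connected(["p1+", "p2+"], nodes, two_parts + [("r1-", "r2-")]))
```

A dangling node forms a singleton group in this sketch, so a graph with one positive node is reported as connected; whether that edge case matches the intended definition is left open here.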

Definition 8. (Definition Graph) Let $P = \{R_1, \dots, R_m\}$ be a program, where
the $R_i$ are rules. Then $P$ can be represented by a set of graphs $G(P) = \{G(R_1), \dots,
G(R_m)\}$, which is divided into two parts: $G_i(P) = \{S \in G(P) \mid S \text{ is an I-graph}\}$ and
$G_e(P) = \{S \in G(P) \mid S \text{ is an E-graph}\}$. $G(P) = (G_i(P), G_e(P))$ is called the
definition graph of $P$.



Definition 9. (Gluing) Let $S = G(R_1)$ and $T = G(R_2)$ be two graphs defined
for two rules $R_1$ and $R_2$, respectively. Gluing $S$ and $T$, denoted by $S * T$, yields a
graph if the negative nodes of $S$ and the positive nodes of $T$ have the same a-labels
except for + and −. The corresponding nodes are merged in the new graph, and
are assigned new v-labels. Their previous a-labels are deleted. Otherwise, the result
is an empty graph (incompatible). If $G$ and $H$ are two sets of graphs, $G * H =
\{S * T \mid S \in G, T \in H\}$.
The gluing operation is similar to the resolution operation in logic programming
or the join operation in relational databases. When we use the second rule to resolve
$m\_p$ in the first one, we obtain a new rule whose corresponding graph is exactly
the one obtained by gluing the graph of the first rule and that of the second
rule.
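
Gluing can be sketched in Python as a resolution step on a simple dictionary encoding of rule graphs. The encoding (keys `pos`, `neg`, `edges`, and so on), the fresh-name tags, and the function names are assumptions made for this illustration, not notation from the paper. The sample rule is the six-argument self-recursive rule $r_1$ of the kind used in Example 10.

```python
from itertools import count

_fresh = count()  # supplies a distinct tag per gluing so renamed nodes never clash


def glue(s, t):
    """Return S * T, or None when the graphs are incompatible (the empty graph)."""
    if s["neg"] is None or s["neg_pred"] != t["pos_pred"] or len(s["neg"]) != len(t["pos"]):
        return None
    tag = f"#{next(_fresh)}"
    merged = dict(zip(t["pos"], s["neg"]))      # T's i-th positive node = S's i-th negative node

    def rename(n):
        return merged.get(n, n + tag)           # all other nodes of T get fresh names

    return {
        "pos_pred": s["pos_pred"], "pos": list(s["pos"]),
        "neg_pred": t["neg_pred"],
        "neg": None if t["neg"] is None else [rename(n) for n in t["neg"]],
        "edges": s["edges"] + [(lab, tuple(rename(n) for n in ns)) for lab, ns in t["edges"]],
    }


def positive_blocks(g):
    """Group the positions of the positive nodes by connected component."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _lab, ns in g["edges"]:
        for n in ns[1:]:
            parent[find(ns[0])] = find(n)
    blocks = {}
    for i, n in enumerate(g["pos"], start=1):
        blocks.setdefault(find(n), []).append(i)
    return sorted(blocks.values())


head = ["X", "Y", "Z", "U", "V", "W"]
r0 = {"pos_pred": "p", "pos": head, "neg_pred": None, "neg": None,
      "edges": [("e", tuple(head))]}
r1 = {"pos_pred": "p", "pos": head,
      "neg_pred": "p", "neg": ["X1", "Y1", "Z1", "U1", "V1", "W1"],
      "edges": [("a1", ("X", "W")), ("a2", ("Y", "X1")), ("a3", ("Z", "Y1")),
                ("a4", ("U", "Z1")), ("a5", ("U1", "V1")), ("a6", ("V", "W1"))]}

print(positive_blocks(r1))
print(positive_blocks(glue(r1, r1)))
```

In this run, gluing $r_1$ with itself connects the second and fifth head arguments, which were in separate components of $r_1$ alone. An E-graph such as `r0` has no negative nodes, so `glue(r0, r1)` is incompatible and returns `None`.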




Fig. 2. Gluing the graph (a) with (b) in Fig. 1 



Hence, the naive evaluation of a program $P$ can be represented by repeatedly gluing
the graphs in $G(P)$ with one another.

Definition 10. (k-th I-graph) Let $G_i(P)$ and $G_e(P)$ be the pair of I-graphs and
E-graphs for a program $P$. The k-th (order) I-graphs of $G_i(P)$, denoted by $G_i^k(P)$,
are defined iteratively as follows:

$G_i^1(P) = G_i(P)$;

$G_i^k(P) = G_i^{k-1}(P) * G_i(P)$

Definition 11. (k-th E-graph) Let $G_i(P)$ and $G_e(P)$ be the pair of I-graphs and
E-graphs for a program $P$. The k-th (order) E-graphs, denoted by $R^k(P)$, are
defined as follows:

$R^0(P) = G_e(P)$;

$R^k(P) = G_i^k(P) * G_e(P)$

We denote $G^*(P) = \bigcup_{k \ge 1} G_i^k(P)$, $R^*(P) = \bigcup_{k \ge 0} R^k(P)$, and
$Gen(P) = G^*(P) \cup R^*(P)$. $G^*(P)$, $R^*(P)$, and $Gen(P)$ are called the
generated I-graph, generated E-graph, and generated graph, respectively. Similarly,
for a magic predicate $p$, we denote by $G^*(p)$, $R^*(p)$ and $Gen(p)$ the subsets
of graphs in $G^*(P)$, $R^*(P)$ and $Gen(P)$, respectively, whose positive nodes are
marked as $p_i^+$.

Lemma 1. Let $P$ be a program. Then

(1) $G_i^m(P) = (G_i(P))^m$

(2) $R^{m+n}(P) = G_i^m(P) * R^n(P)$

where $(G_i(P))^m$ is a shorthand for $G_i(P) * \cdots * G_i(P)$ ($m-1$ gluing operations).
Proof. (1) The proof is by induction on $m$. First, $G_i^1(P) = G_i(P)$ by the
definition. Next, we have $G_i^m(P) = G_i^{m-1}(P) * G_i(P) = (G_i(P))^{m-1} * G_i(P) =
(G_i(P))^m$.

(2) By the definition and (1), we have

$$
\begin{aligned}
G_i^m(P) * R^n(P) &= G_i^m(P) * G_i^n(P) * G_e(P) && ;\ R^n(P) = G_i^n(P) * G_e(P) \\
&= (G_i(P))^m * (G_i(P))^n * G_e(P) && ;\ (1) \\
&= (G_i(P))^{m+n} * G_e(P) && ;\ \text{definition} \\
&= G_i^{m+n}(P) * G_e(P) && ;\ (1) \\
&= R^{m+n}(P) && ;\ \text{definition.}
\end{aligned}
$$

□



Next, we prove an important theorem. 

Theorem 1. (Transformation Theorem) Let $(G_i^k(P), \bigcup_{j=0}^{k-1} R^j(P))$ be a hypergraph
defining a new program, say $Q$, where $G_i^k(P)$ is the k-th I-graph and
$R^j(P)$ ($j = 0, \dots, k-1$) are the j-th E-graphs of $P$. Then $R^*(P) = R^*(Q)$.



Proof. (1) For an arbitrary graph $S \in R^*(P)$, assume it is in $R^l(P)$, that is, it
is an $l$-th E-graph. If $l < k$ then $R^l(P) \subseteq G_e(Q) \subseteq R^*(Q)$. If $l \ge k$, write
$l = a \cdot k + b$; we have $R^l(P) = R^{a \cdot k + b}(P) = G_i^{a \cdot k}(P) * R^b(P) = (G_i^k(P))^a *
R^b(P) = G_i^a(Q) * R^b(P) \subseteq G_i^a(Q) * G_e(Q)$, hence $R^l(P) \subseteq R^a(Q) \subseteq R^*(Q)$.
Hence, $S \in R^*(Q)$ and $R^*(P) \subseteq R^*(Q)$.

(2) For an arbitrary $S \in R^*(Q)$, assume it is in $R^l(Q)$. Since $R^l(Q) = G_i^l(Q) *
G_e(Q) = (G_i(Q))^l * G_e(Q) = (G_i^k(P))^l * (\bigcup_{j=0}^{k-1} R^j(P)) = G_i^{k \cdot l}(P) * (\bigcup_{j=0}^{k-1} R^j(P))$,
we have $R^l(Q) \subseteq R^*(P)$. Hence, $S \in R^*(P)$ and $R^*(Q) \subseteq R^*(P)$. □



In relational database theory, we usually assume that the size of a predicate
defined by a connected nonrecursive relational expression is not very
big. In our graph model notation, the predicate defined by an E-graph is not
a big one, because any E-graph is a single-connected-component graph. Hence,
for a magic predicate $p$, if there are only a finite number of I-graphs in $G^*(p)$
whose positive nodes belong to different connected components, $p$ is not a big
relation, because those connected components will become connected in a new
program, according to Theorem 1. Assume $k$ is the largest order of those
I-graphs whose positive nodes belong to different connected components; then
$(G_i^k(P), \bigcup_{j=0}^{k-1} R^j(P))$ is a new program $Q$ which is equivalent to the original program
but contains no more than two connected components in its $G^*(Q)$. Hence, the magic
predicate $p$ will not produce a big relation in $Q$.

However, if there is an infinite number of such I-graphs in $G^*(p)$, the corresponding
magic predicate may form a big relation very easily, since each connected
component forms at least one cycle in the data independently of the other
connected components. Let the largest cycle lengths in the connected components
be $c_1, \dots, c_k$, respectively. If the cycle lengths $c_i$ are pairwise relatively prime, the size of the
result relation of the magic predicate is $c_1 c_2 \cdots c_k$. Hence, the magic predicate
produces a big relation in this case.
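
The product bound for relatively prime cycle lengths can be checked with a small Python sketch. It assumes a simplified model that is not spelled out in the text: each argument of the magic predicate walks its own cycle one step per iteration, and the relation collects the tuples of positions reached. The function name `lockstep_tuples` is hypothetical.

```python
def lockstep_tuples(cycle_lengths):
    """Count the distinct position tuples reached when independent cycles advance in lockstep."""
    seen = set()
    step = 0
    while True:
        state = tuple(step % c for c in cycle_lengths)
        if state in seen:           # the joint walk has returned to a visited state
            return len(seen)
        seen.add(state)
        step += 1


print(lockstep_tuples([3, 4]))      # relatively prime lengths
print(lockstep_tuples([2, 4]))      # lengths sharing a factor
```

For lengths 3 and 4 the walk visits all 12 combinations, whereas lengths 2 and 4 give only 4, so the full product is reached only in the relatively prime case.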
From this intuition, we need only identify and factor those magic predicates
$p$ whose $G^*(p)$ contains an infinite number of I-graphs which contain more than 2
connected components. In the next section, we propose an algorithm to identify
such magic predicates.



4 Arguments Partitioning 

In this section, we tackle the optimal splitting problem based on the proposed
graph model. From the analysis of the previous sections, we know that those
arguments which are not connected in an infinite number of generated graphs should be
split. Otherwise, they should remain in the same predicate. We summarize this
in the following definition.

Definition 12. (c-partition) Given a set of graphs $Gen(p)$, a partition $c^{(p)} =
\{c_1^{(p)}, \dots, c_k^{(p)}\}$ of the positive nodes in each graph in $Gen(p)$ is called a c-partition
of the magic predicate $p$ if

(1) For any given positive integer $K$, there are $k > K$ graphs in $Gen(p)$ in
which $p_s \in c_i^{(p)}$ and $p_t \in c_j^{(p)}$ ($i \ne j$) are not connected.

(2) There are at most a finite number of graphs in $Gen(p)$ in which $p_s \in c_i^{(p)}$ and
$p_t \in c_i^{(p)}$ are not connected.

$c_i^{(p)}$ is called a c-block of the c-partition $c^{(p)}$. Brackets are used to represent a
block of the c-partition.

The c-partition is the optimal splitting schema for magic predicates.

Theorem 2. Let $c^{(p)} = \{c_1^{(p)}, \dots, c_k^{(p)}\}$ be a c-partition of the magic predicate
$p$. Assume $p_s \in c_i^{(p)}$ and $p_t \in c_i^{(p)}$ ($s \ne t$) are two positive nodes in the same
block. Then there exists a finite integer $K$ such that $p_s$ and $p_t$ are connected in all
generated graphs of the graph $G(Q) = (G_i^K(p), \bigcup_{j=0}^{K-1} R^j(p))$.

Proof. According to the definition of the c-partition, there are only a finite number
of graphs in $Gen(p)$ in which $p_s^+$ and $p_t^+$ are not connected. Let $K$ be the largest
order of the I-graphs among these graphs. Consider the graph $(G_i^K(p), \bigcup_{j=0}^{K-1} R^j(p))$.
In this graph, $p_s^+$ and $p_t^+$ are connected in all E-graphs $R^j(p)$ ($j = 0, \dots, K-1$),
because those I-graphs in $Gen(p)$ in which $p_s^+$ and $p_t^+$ are not connected become
connected E-graphs in $G(Q)$. $p_s^+$ and $p_t^+$ are also connected in all I-graphs of $G(Q)$.

□

This theorem says that if the arguments in the same c-block are defined as a new
predicate, this predicate will not produce a big relation.

Theorem 3. The c-partition is unique for a given $Gen(p)$. That is, if $c^{(p)}$ and
$d^{(p)}$ are two c-partitions of $Gen(p)$, then $c^{(p)} = d^{(p)}$.
Proof. Let $c^{(p)} = \{c_1^{(p)}, \dots, c_k^{(p)}\}$ and $d^{(p)} = \{d_1^{(p)}, \dots, d_l^{(p)}\}$ be two different
c-partitions of the magic predicate $p$. Without loss of generality, assume $p_s \in c_i^{(p)}$
and $p_s \in d_j^{(p)}$, but $c_i^{(p)} \ne d_j^{(p)}$. According to the definition of c-partition,
there exists an integer $k_c$ such that all arguments in $c_i^{(p)}$ become connected
in all generated graphs of $(G_i^{k_c}(p), \bigcup_{j=0}^{k_c-1} R^j(p))$. Similarly, there exists an
integer $k_d$ such that all arguments in $d_j^{(p)}$ become connected in all generated
graphs of $(G_i^{k_d}(p), \bigcup_{j=0}^{k_d-1} R^j(p))$. Let $k = \max\{k_c, k_d\}$. Then in all generated graphs of
$(G_i^k(p), \bigcup_{j=0}^{k-1} R^j(p))$, the arguments in $c_i^{(p)}$ and $d_j^{(p)}$ become connected, since both
contain $p_s$. By the definition of c-partition, $c_i^{(p)}$ should be the same as $d_j^{(p)}$.
This contradicts the assumption. □

This theorem says that the optimal splitting schema for a specific magic predicate
is unique. Hence, we need only an algorithm to construct a c-partition, without
being concerned about whether it is the best. Unfortunately, we currently have no
algorithm to construct the c-partition. Instead, we propose a concept that
approximates the c-partition in the sequel.

Definition 13. Given a set of graphs $Gen(p)$, a partition $d[k]^{(p)} = \{d_1^{(p)}, \dots, d_l^{(p)}\}$
of the positive nodes in the graphs $Gen(p)$ is called a d[k]-partition (or d-partition
when we need not emphasize the order) of the magic predicate $p$ if there are
at most a finite number of graphs in $Gen(p)$, all of order less than or equal to
$k$, in which $p_s \in d_i^{(p)}$ and $p_t \in d_i^{(p)}$ are not connected. $d_i^{(p)}$ is called a d-block of the
d[k]-partition. Similarly, brackets are used to represent a block of the d-partition.

In contrast to the c-partition, we do not care whether two nodes in different blocks
are connected. We just require that the nodes in the same block be
connected except in some finite number of graphs.

Definition 14. Let $c^{(p)}$ and $d^{(p)}$ be two c-partitions or d-partitions. We say $c^{(p)} \le
d^{(p)}$ if for each block $c_i^{(p)} \in c^{(p)}$ there exists a block $d_j^{(p)} \in d^{(p)}$ such that $c_i^{(p)} \subseteq d_j^{(p)}$.




From the definition of c-partition and d-partition, we have the following 
property. 

Theorem 4. Given a set of graphs $Gen(p)$ and two integers $k < l$, let $c^{(p)}$ be a
c-partition of $p$, and let $d[k]^{(p)}$ and $d[l]^{(p)}$ be the d[k]-partition and d[l]-partition of $p$,
respectively. Then

(1) $d[k]^{(p)} \le c^{(p)}$

(2) $d[k]^{(p)} \le d[l]^{(p)}$

This theorem says that we have $d[1]^{(p)} \le d[2]^{(p)} \le \cdots \le c^{(p)}$. Hence, the
c-partition is the upper bound of all d[k]-partitions. We can use d-partitions to
approximate the optimal c-partition.

Next, we give an algorithm to evaluate d-partitions. 
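Definition 15. Let $c^{(p)} = \{c_1^{(p)}, \dots, c_k^{(p)}\}$ and $d^{(p)} = \{d_1^{(p)}, \dots, d_l^{(p)}\}$ be two
partitions of the magic predicate $p$. $c^{(p)} \times d^{(p)} = \{c_i^{(p)} \cap d_j^{(p)} \mid i = 1, \dots, k;\ j =
1, \dots, l\}$ is called the product of the two partitions.

Obviously, the product $c^{(p)} \times d^{(p)}$ is still a partition of $p$, and $c^{(p)} \times d^{(p)} \le c^{(p)}$,
$c^{(p)} \times d^{(p)} \le d^{(p)}$.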
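
The product of two partitions and the ordering $\le$ can be sketched in Python as follows. Blocks are modeled as sets of argument positions, empty intersections are dropped so that the result is again a partition, and the function names `product` and `refines` are hypothetical. The sample data are two hypothetical partitions of six argument positions.

```python
def product(c, d):
    """Product of two partitions: all non-empty pairwise block intersections."""
    return [ci & dj for ci in c for dj in d if ci & dj]


def refines(c, d):
    """True if c <= d, i.e. every block of c lies inside some block of d."""
    return all(any(ci <= dj for dj in d) for ci in c)


c = [{1, 6}, {2}, {3}, {4}, {5}]
d = [{1, 2}, {3}, {4}, {5, 6}]
cd = product(c, d)
print(sorted(sorted(block) for block in cd))
print(refines(cd, c), refines(cd, d))
```

Here positions 1 and 6 share a block in `c` but not in `d`, and positions 1 and 2 share a block in `d` but not in `c`, so the product consists of six singleton blocks and refines both inputs.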

Algorithm 3 (Constructing the d[k]-partition for magic predicates)^

Input: $G(P)$, $k$

Output: the d[k]-partition of each magic predicate

Method:

(1) If $k = 1$, construct a partition $d[1]^{(p)}$ for $p$ as follows. Let $R \in Gen(p)$, and let $C$ be
the set of non-magic predicates in $R$. Construct an initial partition $d[1]^{(p,R)}$
for every rule $R \in Gen(p)$. Each block consists of all positive nodes of a
connected component in $C$. In this step, we do not consider the existence of the
magic predicate in the body. $d[1]^{(p)} = \times_{R \in Gen(p)}\, d[1]^{(p,R)}$.

(2) If $k > 1$, we first transform the program $P$ and then construct the partition.

(2.1) Construct a new hypergraph $G(Q) = (G_i^k(P), \bigcup_{j=0}^{k-1} R^j(P))$.

(2.2) Construct a d[1]-partition for $p$ in the new hypergraph $G(Q)$. Then,
it is the d[k]-partition of $p$ in the hypergraph $G(P)$.

Example 9. Consider the magic program in Example [?]. We construct d-partitions
as follows. (1) $d[1]^{(q)} = \{[q_1^+, q_2^+]\}$, and $d[1]^{(p)} = \{[p_1^+], [p_2^+]\}$. (2) By doing the graph
transformation $(G_i^2(P), R^0(P) \cup R^1(P))$ and constructing the d[1]-partition in the
new graph, we get $d[2]^{(q)} = \{[q_1^+, q_2^+]\}$, $d[2]^{(p)} = \{[p_1^+, p_2^+]\}$. Please note that,
although the two arguments of $m\_p$ belong to different connected components
in the second rule, they belong to the same block in the d[2]-partition. As a
result, we have two single-block d-partitions. Since $d[2]^{(p)} \le d[k]^{(p)}$ ($k > 2$), and
$d[2]^{(p)}$ is already a single-block partition, $d[k]^{(p)} = d[2]^{(p)}$ ($k > 2$). □

The following is a more complex example. It consists of two self-recursive 
rules. 

Example 10. A program is defined as follows:

r0: p(X,Y,Z,U,V,W) :- e(X,Y,Z,U,V,W).
r1: p(X,Y,Z,U,V,W) :- a1(X,W) a2(Y,X1) a3(Z,Y1) a4(U,Z1)
                      a5(U1,V1) a6(V,W1) p(X1,Y1,Z1,U1,V1,W1).
r2: p(X,Y,Z,U,V,W) :- b1(X,Y,X1) b2(Z,Z1,Y1,U1) b3(U,V1)
                      b4(V,W1,W) p(X1,Y1,Z1,U1,V1,W1).

The initial partitions for rules $r_1$ and $r_2$ are

$d[1]^{(p,r_1)} = \{[p_1^+, p_6^+][p_2^+, p_1^-][p_3^+, p_2^-][p_4^+, p_3^-][p_5^+, p_6^-][p_4^-, p_5^-]\}$

$d[1]^{(p,r_2)} = \{[p_1^+, p_2^+, p_1^-][p_3^+, p_2^-, p_3^-, p_4^-][p_4^+, p_5^-][p_5^+, p_6^+, p_6^-]\}$

^ In this algorithm, we do not consider dangling nodes in I-graphs. They can
be processed in the same way as in [?].
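Hence the d[1]-partition is

$d[1]^{(p)} = \{[p_1^+, p_6^+][p_2^+][p_3^+][p_4^+][p_5^+]\} \times \{[p_1^+, p_2^+][p_3^+][p_4^+][p_5^+, p_6^+]\}$

$= \{[p_1^+][p_2^+][p_3^+][p_4^+][p_5^+][p_6^+]\}$

That is, it consists of a set of single-node blocks.

For the d[2]-partition, after generating the second-order I-graphs, we have their initial
partitions as

$d[2]^{(p, r_1 r_1)} = \{[p_1^+, p_6^+][p_2^+, p_5^+][p_3^+, p_1^-][p_4^+, p_2^-][p_3^-, p_6^-][p_4^-, p_5^-]\}$

$d[2]^{(p, r_1 r_2)} = \{[p_1^+, p_6^+][p_2^+, p_3^+, p_1^-][p_4^+, p_2^-, p_3^-, p_4^-][p_5^+, p_5^-, p_6^-]\}$

$d[2]^{(p, r_2 r_1)} = \{[p_1^+, p_2^+, p_5^+, p_6^+][p_3^+, p_1^-, p_2^-, p_3^-][p_4^+, p_6^-][p_4^-, p_5^-]\}$

$d[2]^{(p, r_2 r_2)} = \{[p_1^+, p_2^+, p_3^+, p_1^-, p_2^-, p_3^-, p_4^-, p_5^-][p_4^+, p_5^+, p_6^+, p_6^-]\}$

After deleting negative nodes and computing the product of these partitions, we obtain

$d[2]^{(p)} = \{[p_1^+][p_2^+][p_3^+][p_4^+][p_5^+][p_6^+]\}$

Similarly, for a higher order we can get the partition $\{[p_1^+, p_6^+][p_2^+][p_3^+][p_4^+][p_5^+]\}$. The nodes $p_1^+$ and $p_6^+$
become connected in the corresponding higher-order I-graphs.

□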



Theorem 5. The partition obtained by Algorithm 3 is a d-partition.

Proof. We first prove the case of $k = 1$. Since the nodes in the same block are
connected by hyperedges in the I-graphs, they are connected in all generated
I-graphs of higher order. Hence it is a d[1]-partition.

We then prove the case of $k > 1$. Compare the graph $(G_i^1(P), R^0(P))$ with the
graph $(G_i^k(P), \bigcup_{j=0}^{k-1} R^j(P))$. The parts that differ are those graphs with order less
than or equal to $k$. An I-graph with order $< k$ in $(G_i^1(P), R^0(P))$ has been
transformed into an E-graph in $(G_i^k(P), \bigcup_{j=0}^{k-1} R^j(P))$. Hence the nodes in a block
of the generated d-partition are connected to each other in all I-graphs with order
greater than $k$. Only in those I-graphs with order $< k$ may these nodes belong to
different blocks. □

By comparing the two algorithms, it is easy to see that Sippu and Soisalon-Soininen’s
factoring algorithm is exactly our d[1]-partition, except for the processing of
dangling nodes in I-graphs. In the current version of the d-partition algorithm,
we do not process dangling nodes, that is, those nodes corresponding
to variables of the magic predicate in the head that do not appear
in any EDB predicate in the body of the rule. Hence, we have the following
result.

Theorem 6. Let $b^{(p)}$ be the partition for a magic predicate $p$ obtained by
Sippu and Soisalon-Soininen’s factoring algorithm and let $d^{(p)}$ be a d-partition.
Then $b^{(p)} \le d^{(p)}$.
This means that our algorithm can generate a better splitting schema. It thus partially
solves the “over-splitting” problem. In the following, we give the revised
factoring algorithm in our graph model notation. It can also process dangling
nodes. In order to capture more information, the a-labeled nodes should also be
given v-labels.



Algorithm 4. (Revised Factoring)

Input: $G(P)$, $k \ge 1$

Output: $d^{(p)}$ for each magic predicate $p$

Method:

(1) The initial partition $d^{(p)}$ is a one-block partition. That is, it consists of all
arguments of the magic predicate $p$.

(2) Construct $G(Q) = (G_i^k(P), \bigcup_{j=0}^{k-1} R^j(P))$ from $G(P)$.

(3) For each I-graph $R \in G(Q)$ whose positive nodes are marked by $p^+$ and whose
negative nodes are marked by $q^-$, let $C$ be the set of the non-magic subgoals of $R$,
which is partitioned into a set of connected components, say $C_1, \dots, C_m$, $m \ge 1$.
Replace $d^{(p)}$ by $d^{(p)} \times d^{(p,R)}$, where $d^{(p,R)} = \{d_1^{(p,R)}, \dots, d_m^{(p,R)}, \delta^{(p,R)}\}$ is
the d[1]-partition of $R$, $d_i^{(p,R)}$ consists exactly of those arguments which
appear in some subgoal in $C_i$, and $\delta^{(p,R)} = d^{(p)} - \bigcup_{i=1}^{m} d_i^{(p,R)}$.

(4) If the graph $R$ contains two dangling positive nodes and their corresponding
dangling negative nodes (that is, those with the same v-labels) belong to
different blocks in $d^{(q)}$, then $d^{(p)}$ must be replaced by $d^{(p)} \times d'^{(q)}$, where $d'^{(q)}$
is obtained from $d^{(q)}$ as follows.

— retain those blocks that consist exactly of dangling nodes.

— combine all the other blocks that contain no dangling node.

This step should be repeated until no change occurs. □



5 Conclusions and Further Research 

The concept of factorization was first proposed by Sagiv [?] and revised by Sippu
and Soisalon-Soininen [7]. Its primary purpose is to guarantee that the magic-sets
algorithm does not generate big temporary relations. A common assumption is
that a predicate defined by a set of EDB predicates that are connected to each
other has a reasonable size. However, although their strategy can guarantee
that the generated magic predicates are defined by single-connected-component
rules, it suffers from the “over-splitting” problem. That is, some original magic
predicates that do not produce big relations will be split. In this paper, we
focused on the problem: what is the best splitting schema for a recursively
defined magic predicate under the same assumption? We proposed the concept
of c-partition as the solution. Unfortunately, we have not found an appropriate
algorithm to construct this best splitting schema. As an approximate solution,
we proposed the d[k]-partition concept, and proved that the d[k]-partition is better than
the d[l]-partition if $k > l$. We also proved that all d-partitions are better than Sippu
and Soisalon-Soininen’s factoring. All of these results are based on a proposed
hypergraph model that represents a magic program as well as its naive evaluation
procedure.

There are still some open problems. First, in the current version of the
algorithm for constructing d-partitions, we generate higher-order I-graphs
explicitly. From Example 9 and Example 10 we see that it is possible to generate
a d-partition directly from the pattern of the initial partitions. Hence, developing a
more efficient algorithm to construct d-partitions is still interesting. Next, we
know that $d[1]^{(p)} \le d[2]^{(p)} \le \cdots \le c^{(p)}$, where $d[k]^{(p)}$ is the d[k]-partition and $c^{(p)}$ is the
c-partition. That means $c^{(p)}$ is an upper bound of the d-partitions. We further hope
that the following proposition is true.

Proposition 1. For a given program $P$, there exists a finite integer $k$ such that
the d[k]-partition is exactly the c-partition.

Moreover, the existence of an efficient algorithm to construct the c-partition directly
is also interesting.
Abstract. In the literature, there are quite a few sequential and parallel
algorithms for solving problems on distance-hereditary graphs. Given an
n-vertex distance-hereditary graph $G$, we show that the perfect dominating
set problem on $G$ can be solved in $O(\log^2 n)$ time using $O(n + m)$
processors on a CREW PRAM.



1 Introduction 

A graph is distance-hereditary [?] if the distance between any
two vertices stays the same in every connected induced subgraph containing both (where
the distance between two vertices is the length of a shortest path connecting
them). Distance-hereditary graphs form a subclass of perfect graphs [?],
which are graphs $G$ in which the maximum clique size equals the chromatic number
for every induced subgraph of $G$ [?]. Two well-known classes of graphs, trees
and cographs, both belong to the distance-hereditary graphs. Properties of
distance-hereditary graphs have been studied by many researchers,
which has resulted in sequential or parallel algorithms for quite a few
interesting graph-theoretical problems on this special class of graphs.

Previous parallel algorithms designed for distance-hereditary graphs are
briefly summarized as follows. In [?], Dahlhaus presented an algorithm to recognize
distance-hereditary graphs in $O(\log^2 n)$ time using $O(n + m)$ processors
on a CREW PRAM, where $n$ and $m$ are the numbers of vertices and edges of
a given graph. In [2], the all-to-all vertex distances of a distance-hereditary graph
were computed. In [?], Hsieh et al. presented algorithms to find a minimum
weighted connected dominating set and a minimum weighted Steiner tree of
a distance-hereditary graph in $O(\log n)$ time using $O(n + m)$ processors on a
CRCW PRAM. A minimum connected $\gamma$-dominating set and a $\gamma$-dominating
clique of a distance-hereditary graph can be found in $O(\log n \log\log n)$ time
using $O((n + m)/\log\log n)$ processors on a CRCW PRAM [?]. In [?], Hsieh
et al. solved several subgraph optimization problems including the maximum
independent set problem, the maximum clique problem, the vertex connectivity
problem, the edge connectivity problem, the dominating set problem and the
independent dominating set problem in $O(\log^2 n)$ time using $O(n + m)$ processors
on a CREW PRAM.

Given a simple graph $G = (V, E)$, a vertex $v \in V$ is said to dominate itself
and all vertices $u \in V$ adjacent to $v$. A perfect dominating set of $G$ is a subset
$D$ of $V$ such that every vertex in $V \setminus D$ is dominated by exactly one vertex of
$D$. The perfect dominating set problem is to find a perfect dominating set of $G$
with the minimum cardinality [?]. In this paper, we show that the perfect dominating
set problem can be solved in linear time on a distance-hereditary graph $G$.
We also parallelize the sequential algorithm to run in $O(\log^2 n)$ time using $O(n + m)$
processors on a CREW PRAM, where $n$ and $m$ are the numbers of vertices and
edges of $G$. If $G$ is represented by its decomposition tree, the perfect dominating
set problem can be optimally solved in $O(\log n)$ time using $O(n/\log n)$
processors on an EREW PRAM. The complexities of the domination problem
and its variants, including (a) the dominating set problem, (b) the independent
dominating set problem, (c) the connected dominating set problem, (d) the connected
$\gamma$-dominating set problem, and (e) the perfect dominating set problem
on distance-hereditary graphs, are summarized in Table 1.
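
The definition of a perfect dominating set can be made concrete with a small Python sketch. It checks the definition directly and finds a minimum set by exhaustive search, so it is only suitable for tiny graphs and is not the linear-time or parallel algorithm of this paper. The adjacency-dictionary encoding and the function names are assumptions made for illustration.

```python
from itertools import combinations


def is_perfect_dominating_set(adj, D):
    """Every vertex outside D must have exactly one neighbour in D."""
    D = set(D)
    return all(len(adj[v] & D) == 1 for v in adj if v not in D)


def minimum_perfect_dominating_set(adj):
    """Brute-force search by increasing size; D = V always qualifies, so a set is returned."""
    vertices = sorted(adj)
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            if is_perfect_dominating_set(adj, cand):
                return set(cand)


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(minimum_perfect_dominating_set(path))
print(len(minimum_perfect_dominating_set(cycle4)))
```

On the three-vertex path the middle vertex alone suffices, while on the 4-cycle no single vertex works, because the opposite vertex is left undominated, so two adjacent vertices are needed.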



Problem   Sequential        Parallel
                            Time                      Processors                   Model
(a)       $O(n + m)$ [?]    $O(\log^2 n)$             $O(n + m)$                   CREW [?]
(b)       $O(n + m)$ [?]    $O(\log^2 n)$             $O(n + m)$                   CREW [?]
(c)       $O(n + m)$ [?]    $O(\log n)$               $O(n + m)$                   CRCW [?]
(d)       $O(n + m)$ [?]    $O(\log n \log\log n)$    $O((n + m)/\log\log n)$      CRCW [?]
(e)       $O(n + m)$        $O(\log^2 n)$             $O(n + m)$                   CREW

Table 1. The complexities of various domination problems.



The computation model used here is the deterministic parallel random access
machine (PRAM), which permits concurrent read and exclusive write (CREW),
or exclusive read and exclusive write (EREW), in its shared memory. The rest of this paper
is organized as follows. In Section 2, we review some previously known properties
of distance-hereditary graphs and give some basic definitions. In Section 3, we
first show that the perfect dominating set problem can be solved recursively in $O(n + m)$
time on a distance-hereditary graph represented by its binary tree form, called
a decomposition tree. We then show that our recursive scheme can be efficiently
parallelized using the tree contraction technique. The conclusion is given in Section 4.

2 Preliminaries 

This paper considers finite, simple, undirected graphs $G = (V, E)$, where $V$ and
$E$ are the vertex and edge sets of $G$, respectively. Let $n = |V|$ and $m = |E|$. For



An Optimal Parallel Algorithm for the Perfect Dominating Set Problem 115 



standard graph-theoretic terminology and notation, see [?]. Let $G[X]$ denote
the subgraph of $G$ induced by $X \subseteq V$. The union of two graphs $G_1 = (V_1, E_1)$
and $G_2 = (V_2, E_2)$ is the graph $G = (V_1 \cup V_2, E_1 \cup E_2)$.

The class of distance-hereditary graphs can be defined by the following re- 
cursive definition. 

Definition 1. (1) A graph consisting of a single vertex $v$ is a primitive
distance-hereditary graph with the twin set $\{v\}$.

(2) If $G_1$ and $G_2$ are distance-hereditary graphs with the twin sets $S_1$ and $S_2$,
respectively, then the union of $G_1$ and $G_2$ is a distance-hereditary graph $G$ with
the twin set $S_1 \cup S_2$. In this case, we say $G$ is formed from $G_1$ and $G_2$ by the
false twin operation $\odot$.

(3) If $G_1$ and $G_2$ are distance-hereditary graphs with the twin sets $S_1$ and $S_2$,
respectively, then the graph $G$ obtained from $G_1$ and $G_2$ by connecting every
vertex of $S_1$ to all vertices of $S_2$ is a distance-hereditary graph with the twin
set $S_1 \cup S_2$. In this case, we say $G$ is formed from $G_1$ and $G_2$ by the true twin
operation $\otimes$.

(4) If $G_1$ and $G_2$ are distance-hereditary graphs with the twin sets $S_1$ and $S_2$,
respectively, then the graph $G$ obtained from $G_1$ and $G_2$ by connecting every vertex
of $S_1$ to all vertices of $S_2$ is a distance-hereditary graph with the twin set $S_1$. In
this case, we say $G$ is formed from $G_1$ and $G_2$ by the attachment operation $\oplus$.
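
The recursive definition can be mirrored by three small Python functions that build a graph together with its twin set. This is a sketch under stated assumptions: graphs are adjacency dictionaries, the two operands are assumed to have disjoint vertex sets, and the function names are hypothetical.

```python
def single(v):
    """Primitive graph: one vertex, twin set {v}."""
    return {v: set()}, {v}


def _disjoint_union(g1, g2):
    assert not (g1.keys() & g2.keys()), "operands must have disjoint vertex sets"
    merged = {v: set(nbrs) for v, nbrs in g1.items()}
    merged.update({v: set(nbrs) for v, nbrs in g2.items()})
    return merged


def _join(g, s1, s2):
    """Connect every vertex of s1 to every vertex of s2."""
    for u in s1:
        for w in s2:
            g[u].add(w)
            g[w].add(u)


def false_twin(a, b):
    (g1, s1), (g2, s2) = a, b
    return _disjoint_union(g1, g2), s1 | s2      # no new edges, twin set S1 u S2


def true_twin(a, b):
    (g1, s1), (g2, s2) = a, b
    g = _disjoint_union(g1, g2)
    _join(g, s1, s2)
    return g, s1 | s2                            # twin set S1 u S2


def attach(a, b):
    (g1, s1), (g2, s2) = a, b
    g = _disjoint_union(g1, g2)
    _join(g, s1, s2)
    return g, set(s1)                            # twin set stays S1


# A path a - c - b: attach the false twins {a, b} to the single vertex c.
path_graph, path_twins = attach(single("c"), false_twin(single("a"), single("b")))
print(path_graph, path_twins)
```

The true twin and attachment operations add the same edges and differ only in the twin set they pass on, which is what later operations connect to.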

The above definition implies that a distance-hereditary graph can be repre- 
sented by a binary tree form, called a decomposition tree, which is defined as 
follows. 

Definition 2. (1) The tree consisting of a single vertex labeled $v$ is a decomposition
tree of the primitive distance-hereditary graph $G = (\{v\}, \emptyset)$.

(2) Let $D_1$ and $D_2$ be the decomposition trees of distance-hereditary graphs $G_1$
and $G_2$, respectively.

(a) If $G$ is a distance-hereditary graph formed from $G_1$ and $G_2$ by the true twin
operation, then a tree $D$ with the root $r$ represented by $\otimes$ and with the roots of
$D_1$ and $D_2$ being the two children of $r$ is a decomposition tree of $G$.

(b) If $G$ is a distance-hereditary graph formed from $G_1$ and $G_2$ by the attachment
operation, then a tree $D$ with the root $r$ represented by $\oplus$ and with the
root of $D_1$ (respectively, $D_2$) being the right (respectively, left) child of $r$ is a
decomposition tree of $G$.

(c) If $G$ is a distance-hereditary graph formed from $G_1$ and $G_2$ by the false twin
operation, then a tree $D$ with the root $r$ represented by $\odot$ and with the roots of
$D_1$ and $D_2$ being the two children of $r$ is a decomposition tree of $G$.

Figure 1 illustrates a decomposition tree of a distance-hereditary graph. 

The sequential and parallel complexities for constructing a decomposition 
tree are given below. 

Lemma 1. A decomposition tree of a distance-hereditary graph can be
constructed in sequential $O(n + m)$ time, and in parallel $O(\log^2 n)$ time using
$O(n + m)$ processors on a CREW PRAM.

Fig. 1. A distance-hereditary graph $G$ with its decomposition tree $\mathcal{D}_G$.



For a vertex $x$ in a decomposition tree $\mathcal{D}_G$, let $\mathcal{D}_G(x)$ be the subtree rooted
at $x$ and let $G_x$ be the subgraph of $G$ induced by the leaves of $\mathcal{D}_G(x)$. Also let
$S_x$ be the twin set of $G_x$ and let $V_x = V(G_x)$. For a node $v$ in a binary tree $T$,
let $child(v)$, $sib(v)$ and $par(v)$ denote the children, the sibling and the parent of
$v$, respectively.



3 The Perfect Dominating Set Problem 

3.1 Recursive Formulas 

Let $G = (V, E)$ be a distance-hereditary graph and $S$ be its twin set. Let $D(G)$
denote a minimum perfect dominating set of $G$. An $S$-type perfect dominating set
of $G$ is a perfect dominating set of $G$ which contains at least one vertex of $S$. An
$\bar{S}$-type perfect dominating set of $G$ is a perfect dominating set of $G$ which contains
no vertex of $S$. A minimum $S$-type (respectively, $\bar{S}$-type) perfect dominating set of
$G$, denoted by $D_S(G)$ (respectively, $D_{\bar{S}}(G)$), is an $S$-type (respectively, $\bar{S}$-type)
perfect dominating set of $G$ with the minimum cardinality. A perfect dominating
set $Q$ of $G[V \setminus S]$ is called $H$-type if $Q \subseteq (V \setminus S)$ and no vertex in $Q$ is adjacent to
any vertex of $S$. Let $H(G)$ denote an $H$-type perfect dominating set of $G[V \setminus S]$
with the minimum cardinality.

We say that two disjoint vertex subsets $X$ and $Y$ of $V$ form a join in a graph
$G = (V, E)$ if every vertex of $X$ is adjacent to all vertices of $Y$. For $k$ ($\ge 1$) sets
$S_1, S_2, \ldots, S_k$, the min operator on $\{S_i \mid 1 \le i \le k\}$ is used to select a set $S_j$
such that $|S_j| = \min\{|S_i| \mid 1 \le i \le k\}$.

Lemma 2. Suppose $G$ is a distance-hereditary graph formed from $G_1$ and $G_2$
with the twin sets $S_1$ and $S_2$, respectively, by the true twin operation and $S$ is
the twin set of $G$. Then,

(1) $D(G) = \min\{D_S(G), D_{\bar{S}}(G)\}$,

(2) $D_S(G) = \min\{D_{S_1}(G_1) \cup H(G_2),\ H(G_1) \cup D_{S_2}(G_2)\}$,

(3) $D_{\bar{S}}(G) = D_{\bar{S}_1}(G_1) \cup D_{\bar{S}_2}(G_2)$,

(4) $H(G) = H(G_1) \cup H(G_2)$.

Proof. We first show (1) holds. By definition, $S_1$ and $S_2$ form a join and the
twin set $S$ is the union of $S_1$ and $S_2$. Besides, no vertex in $V(G_1) \setminus S_1$ is
adjacent to any vertex in $V(G_2) \setminus S_2$. Suppose $D$ is a perfect dominating set of
$G$. Let $D_1 = D \cap V(G_1)$ and $D_2 = D \cap V(G_2)$. There are two cases.

Case 1. $D$ does not contain any vertex of $S$. Thus $D$ is an $\bar{S}$-type perfect
dominating set.

Case 2. $D$ contains at least one vertex of $S$. Obviously, $D$ is an $S$-type perfect
dominating set of $G$.

On the other hand, if $D'$ is an $S$-type perfect dominating set of $G$ or an $\bar{S}$-type
perfect dominating set of $G$, then $D'$ is also a perfect dominating set of $G$.
Therefore, (1) holds.

We next show (2) holds. Suppose $D$ is an $S$-type perfect dominating set of $G$
and let $D_1 = D \cap V(G_1)$ and $D_2 = D \cap V(G_2)$. By the definition of the perfect
domination problem, it is impossible that $D \cap S_1 \neq \emptyset$ and $D \cap S_2 \neq \emptyset$. Thus we
consider the following two cases.

Case 1. $D$ does not contain any vertex in $S_1$. By definition, $D$ contains at least
one vertex of $S_2$. Since no vertex of $V(G_2)$ is adjacent to any vertex of $V(G_1) \setminus S_1$,
$D_1$ is an $H$-type perfect dominating set of $G[V(G_1) \setminus S_1]$ and $D_2$ is an $S_2$-type
perfect dominating set of $G_2$.

Case 2. $D$ does not contain any vertex in $S_2$. By arguments similar to those for the
above case, $D_1$ is an $S_1$-type perfect dominating set of $G_1$ and $D_2$ is an $H$-type
perfect dominating set of $G[V(G_2) \setminus S_2]$.

On the other hand, if

(a) $P$ is an $H$-type perfect dominating set of $G[V(G_1) \setminus S_1]$ and $Q$ is an $S_2$-type
perfect dominating set of $G_2$, or

(b) $P$ is an $S_1$-type perfect dominating set of $G_1$ and $Q$ is an $H$-type perfect
dominating set of $G[V(G_2) \setminus S_2]$,

then $P \cup Q$ is an $S$-type perfect dominating set of $G$. From the above discussion,
(2) holds. Formulas (3) and (4) can be shown similarly. □

Lemma 3. Suppose $G$ is a distance-hereditary graph formed from $G_1$ and $G_2$
with the twin sets $S_1$ and $S_2$, respectively, by the attachment operation and $S$ is
the twin set of $G$. Then,

(1) $D(G) = \min\{D_S(G), D_{\bar{S}}(G)\}$,

(2) $D_S(G) = D_{S_1}(G_1) \cup H(G_2)$,

(3) $D_{\bar{S}}(G) = \min\{H(G_1) \cup D_{S_2}(G_2),\ D_{\bar{S}_1}(G_1) \cup D_{\bar{S}_2}(G_2)\}$,

(4) $H(G) = H(G_1) \cup D_{\bar{S}_2}(G_2)$.

Proof. By definition, $S = S_1$. Besides, $S_1$ and $S_2$ form a join in $G$. Clearly, (1)
and (2) hold. We next show (3) holds. Suppose $D$ is an $\bar{S}$-type perfect dominating
set of $G$ and let $D_1 = D \cap V(G_1)$ and $D_2 = D \cap V(G_2)$. If $D \cap S_2 = \emptyset$,
then $D \cap S_1 = \emptyset$ and $D \cap S_2 = \emptyset$. Therefore, $D_1$ (respectively, $D_2$) is an
$\bar{S}_1$-type (respectively, $\bar{S}_2$-type) perfect dominating set of $G_1$ (respectively, $G_2$).
If $D \cap S_2 \neq \emptyset$, then $D_2$ is an $S_2$-type perfect dominating set of $G_2$ and $D_1$ is
clearly an $H$-type perfect dominating set of $G[V(G_1) \setminus S_1]$. Conversely, if
$D' = P \cup Q$ is such that either (1) $P$ (respectively, $Q$) is an $\bar{S}_1$-type (respectively,
$\bar{S}_2$-type) perfect dominating set of $G_1$ (respectively, $G_2$), or (2) $P$ is an $H$-type
perfect dominating set of $G[V(G_1) \setminus S_1]$ and $Q$ is an $S_2$-type perfect dominating
set of $G_2$, then $D'$ is a perfect dominating set of $G$. Thus (3) holds. Formula (4)
can be shown from the structural characteristics of $G$. □

Since no vertex of $G_1$ is adjacent to any vertex of $G_2$ if $G$ is formed from $G_1$
and $G_2$ by the false twin operation, the following lemma can be obtained.

Lemma 4. Suppose that $G$ is a distance-hereditary graph formed from $G_1$ and
$G_2$ with the twin sets $S_1$ and $S_2$, respectively, by the false twin operation, and $S$
is the twin set of $G$. Then,

(1) $D(G) = D(G_1) \cup D(G_2)$,

(2) $D_S(G) = \min\{D_{S_1}(G_1) \cup D(G_2),\ D(G_1) \cup D_{S_2}(G_2)\}$,

(3) $D_{\bar{S}}(G) = D_{\bar{S}_1}(G_1) \cup D_{\bar{S}_2}(G_2)$,

(4) $H(G) = H(G_1) \cup H(G_2)$.

According to the recursive formulas given in Lemmas 2-4, we have the
following theorem.

Theorem 1. The perfect dominating set problem on a distance-hereditary graph
$G$ can be solved in $O(n + m)$ time.

Proof. By Lemma 1, a decomposition tree $\mathcal{D}_G$ for the given distance-hereditary
graph $G$ can be constructed in $O(n + m)$ time. By Lemmas 2-4, $D(G)$ can
be found using the dynamic programming technique and a postorder traversal of
$\mathcal{D}_G$ in $O(|V(\mathcal{D}_G)|) = O(n)$ time. □

3.2 Parallel Implementation 

In this section, we show the sequential algorithm described in the previous sec- 
tion can be optimally parallelized using the binary tree contraction technique 
described in P . 

The binary tree contraction technique can be regarded as a general method for 
scheduling parallel computations on trees and has been applied to some problems 
on graphs and to evaluation of binary arithmetic computation trees . This 
technique recursively applies two operations, prune and bypass, to a given binary 
tree. Prune(u) is an operation which removes a leaf node u from the current 
tree, and bypass{v) is an operation (following a prune operation) that removes 
a node v with exactly one child w and then lets the parent of v become the new 
parent of w. Figure 2 shows two procedures prune{u) and bypass{v). 

The algorithm initially numbers the leaves in left-to-right order and then
repeats the following steps. In each step, prune and bypass work only on the leaves
with odd index and their parents. Hence, these two operations can be performed
independently and delete $\lfloor l/2 \rfloor$ leaves together with their parents on the binary
tree in each step, where $l$ is the number of the current leaves. Therefore, the tree
will be reduced to a three-node tree after repeating the steps $\lceil \log n \rceil$ times,
where $n$ is the number of leaves of an input tree.

Fig. 2. Two procedures $prune(u)$ and $bypass(v)$.
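
The prune and bypass operations can be illustrated by a small sequential simulation in Python. This is a sketch only, not the parallel EREW PRAM algorithm: the class `Node` and the functions `leaves`, `prune_and_bypass`, `contract` and `comb` are hypothetical names, each round simply removes every odd-indexed leaf one after another, and the simulation stops as soon as two leaves (a three-node tree) remain.

```python
import math


class Node:
    """Node of a full binary tree (every node has zero or two children)."""

    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def leaves(root):
    """Leaves of the tree in left-to-right order."""
    if root.left is None:
        return [root]
    return leaves(root.left) + leaves(root.right)


def prune_and_bypass(u, root):
    """prune(u) followed by bypass(par(u)); returns the possibly new root."""
    v = u.parent
    w = v.right if v.left is u else v.left   # sibling of u
    p = v.parent
    w.parent = p                             # the parent of v becomes the parent of w
    if p is None:
        return w
    if p.left is v:
        p.left = w
    else:
        p.right = w
    return root


def contract(root):
    """Contract the tree to three nodes; returns (root, number of rounds)."""
    rounds = 0
    while len(leaves(root)) > 2:
        rounds += 1
        for u in leaves(root)[0::2]:         # leaves with odd index 1, 3, 5, ...
            if len(leaves(root)) <= 2:
                break
            root = prune_and_bypass(u, root)
    return root, rounds


def comb(n):
    """Left-deep full binary tree with leaves labeled 1..n (n >= 2)."""
    root = Node(1)
    for i in range(2, n + 1):
        root = Node("*", root, Node(i))
    return root


root, rounds = contract(comb(13))
print(len(leaves(root)), rounds, math.ceil(math.log2(13)))
```

In the parallel setting the removals within one round are scheduled so that they do not interfere; the simulation ignores this and only shows how the number of leaves roughly halves per round.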



Lemma 5. If the prune operation and bypass operation can be performed
by one processor in constant time, the binary tree contraction algorithm can be
implemented in $O(\log n)$ time using $O(n/\log n)$ processors on an EREW PRAM,
where $n$ is the number of nodes in an input binary tree.

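Definition 3. Let $G$ be a distance-hereditary graph and let $\mathcal{D}_G$ be a decomposition
tree. The closed system on $\mathcal{D}_G$ is defined as follows. Initially, each leaf $l$
of $\mathcal{D}_G$ (representing a primitive distance-hereditary graph $G_l$) is associated with
$D(G_l)$, $D_{S_l}(G_l)$, $D_{\bar{S}_l}(G_l)$ and $H(G_l)$. There are a constant number of min and
$\cup$ operators associated with each internal node (⊗, ⊕ or ⊙) of $\mathcal{D}_G$. Let
$C^u_{i_j} \in \{D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)\}$ and
$C^w_{i_j} \in \{D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w)\}$.
For each internal node $v$ with the left and right children $u$ and $w$,
respectively, the solutions of $D(G_v)$, $D_{S_v}(G_v)$, $D_{\bar{S}_v}(G_v)$ and $H(G_v)$ can be
obtained by the following rule:

$$D(G_v) = \min\{C^u_{1_1} \cup C^w_{1_1},\ C^u_{1_2} \cup C^w_{1_2},\ \ldots,\ C^u_{1_p} \cup C^w_{1_p}\} \quad (1)$$
$$D_{S_v}(G_v) = \min\{C^u_{2_1} \cup C^w_{2_1},\ C^u_{2_2} \cup C^w_{2_2},\ \ldots,\ C^u_{2_q} \cup C^w_{2_q}\} \quad (2)$$
$$D_{\bar{S}_v}(G_v) = \min\{C^u_{3_1} \cup C^w_{3_1},\ C^u_{3_2} \cup C^w_{3_2},\ \ldots,\ C^u_{3_r} \cup C^w_{3_r}\} \quad (3)$$
$$H(G_v) = \min\{C^u_{4_1} \cup C^w_{4_1},\ C^u_{4_2} \cup C^w_{4_2},\ \ldots,\ C^u_{4_s} \cup C^w_{4_s}\} \quad (4)$$

where $p, q, r, s$ are all constants. Let $\gamma$ be the root of $\mathcal{D}_G$. The goal is to find
$D(G_\gamma) = D(G)$.

In the following, we show that the binary tree contraction can be used to
optimally solve the closed system. During the process of tree contraction, we
construct four functions associated with each node $v \in V(\mathcal{D}_G)$. Let $X_1, X_2, X_3$
and $X_4$ be indeterminates that stand for the possibly unknown solutions of
$D(G_v)$, $D_{S_v}(G_v)$, $D_{\bar{S}_v}(G_v)$ and $H(G_v)$. The functions associated with $v$ possess
the following form:

$$F^v_1(X_1, X_2, X_3, X_4) = \min\{X_{a_1} \cup \alpha_{a_1},\ X_{a_2} \cup \alpha_{a_2},\ \ldots,\ X_{a_i} \cup \alpha_{a_i}\},$$
$$F^v_2(X_1, X_2, X_3, X_4) = \min\{X_{b_1} \cup \alpha_{b_1},\ X_{b_2} \cup \alpha_{b_2},\ \ldots,\ X_{b_j} \cup \alpha_{b_j}\},$$
$$F^v_3(X_1, X_2, X_3, X_4) = \min\{X_{c_1} \cup \alpha_{c_1},\ X_{c_2} \cup \alpha_{c_2},\ \ldots,\ X_{c_k} \cup \alpha_{c_k}\},$$
$$F^v_4(X_1, X_2, X_3, X_4) = \min\{X_{d_1} \cup \alpha_{d_1},\ X_{d_2} \cup \alpha_{d_2},\ \ldots,\ X_{d_l} \cup \alpha_{d_l}\},$$

where $y_j \in \{1, 2, 3, 4\}$, $X_{y_i} \neq X_{y_j}$ for $i \neq j$, and each $\alpha_{y_j}$ is some vertex subset of $G$, for
all $y$ in $\{a, b, c, d\}$.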

We call the above functions goal functions and their common form the closed
form. In the execution of the tree contraction, let $v'$ be the child of $par(v)$ (in
the original tree $\mathcal{D}_G$) which is an ancestor of $v$ or $v$ itself. We let the functions
$F^v_1$, $F^v_2$, $F^v_3$ and $F^v_4$ represent $D(G_{v'})$, $D_{S_{v'}}(G_{v'})$, $D_{\bar{S}_{v'}}(G_{v'})$ and $H(G_{v'})$,
respectively.

During the process of executing the tree contraction, some nodes are removed 
and the functions of the remaining nodes are adjusted such that the functions 
remain closed and the following invariant is maintained. 

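Invariant (A): Let $v$ be an internal node of the current tree such that $v$ holds
min and $\cup$ operators which generate the solutions of $D(G_v)$, $D_{S_v}(G_v)$, $D_{\bar{S}_v}(G_v)$,
and $H(G_v)$ based on Equations (1)-(4). Let $u$ (respectively, $w$) be the left
(respectively, right) child of $v$ in the current tree whose associated functions are
$F^u_t(X_1, X_2, X_3, X_4)$ (respectively, $F^w_t(X_1, X_2, X_3, X_4)$) for all $1 \le t \le 4$. For
each $C^{u'}_{i_j} \in \{D(G_{u'}), D_{S_{u'}}(G_{u'}), D_{\bar{S}_{u'}}(G_{u'}), H(G_{u'})\}$ (respectively,
$C^{w'}_{i_j} \in \{D(G_{w'}), D_{S_{w'}}(G_{w'}), D_{\bar{S}_{w'}}(G_{w'}), H(G_{w'})\}$) in Equations (1)-(4), we replace
it with $F^u_{i_j} \in \{F^u_1, F^u_2, F^u_3, F^u_4\}$ (respectively, $F^w_{i_j} \in \{F^w_1, F^w_2, F^w_3, F^w_4\}$) that
represents $C^{u'}_{i_j}$ (respectively, $C^{w'}_{i_j}$), where $u'$ (respectively, $w'$) is the child of $v$ in
the original $\mathcal{D}_G$ which is an ancestor of $u$ (respectively, $w$) or $u$ (respectively, $w$)
itself. Then,

$$D(G_v) = \min\{F^u_{1_1}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{1_1}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w)),$$
$$F^u_{1_2}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{1_2}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w)),\ \ldots,$$
$$F^u_{1_p}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{1_p}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w))\}.$$

$$D_{S_v}(G_v) = \min\{F^u_{2_1}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{2_1}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w)),$$
$$F^u_{2_2}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{2_2}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w)),\ \ldots,$$
$$F^u_{2_q}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{2_q}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w))\}.$$

$$D_{\bar{S}_v}(G_v) = \min\{F^u_{3_1}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{3_1}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w)),$$
$$F^u_{3_2}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{3_2}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w)),\ \ldots,$$
$$F^u_{3_r}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{3_r}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w))\}.$$

$$H(G_v) = \min\{F^u_{4_1}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{4_1}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w)),$$
$$F^u_{4_2}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{4_2}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w)),\ \ldots,$$
$$F^u_{4_s}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{4_s}(D(G_w), D_{S_w}(G_w), D_{\bar{S}_w}(G_w), H(G_w))\}.$$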



Initially, we define the functions for a node $v \in \mathcal{D}_G$ to be

(i) $F^v_1(X_1, X_2, X_3, X_4) = \min\{X_1 \cup \emptyset\}$,

(ii) $F^v_2(X_1, X_2, X_3, X_4) = \min\{X_2 \cup \emptyset\}$,

(iii) $F^v_3(X_1, X_2, X_3, X_4) = \min\{X_3 \cup \emptyset\}$,

(iv) $F^v_4(X_1, X_2, X_3, X_4) = \min\{X_4 \cup \emptyset\}$.



Invariant (A) holds trivially in this case. We use the tree contraction
algorithm to reduce the given input tree $\mathcal{D}_G$ into a three-node tree $T'$ such that
solving the closed system on $\mathcal{D}_G$ is equivalent to solving the closed system on $T'$. We
augment the prune and bypass operations by constructing corresponding
functions so as to maintain the invariant (A).

In the execution of the tree contraction, assume $prune(u)$ and $bypass(par(u))$
are performed consecutively. Let $par(u) = v$ and $sib(u) = w$. Assume $u$ is the left
child of $v$ (the case of $u$ being the right child can be handled similarly). Note
that $u$ and $v$ are removed; hence, their contributions to the four target solutions
computed at node $par(v)$ have to be incorporated into the functions associated
with $w$. Assume, without loss of generality, that $v$ contains the operators min
and $\cup$ which perform Equations (1)-(4). The four target solutions associated
with node $v$ are given by

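• $D(G_v) = \min\{F^u_{1_1}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{1_1}(X_1, X_2, X_3, X_4),$
$F^u_{1_2}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{1_2}(X_1, X_2, X_3, X_4),\ \ldots,$
$F^u_{1_p}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{1_p}(X_1, X_2, X_3, X_4)\}$

$= \min\{Q_1 \cup \min\{X_{a_1} \cup \alpha_{a_1}, X_{a_2} \cup \alpha_{a_2}, \ldots, X_{a_r} \cup \alpha_{a_r}\},\ Q_2 \cup \min\{X_{b_1} \cup \alpha_{b_1}, X_{b_2} \cup \alpha_{b_2}, \ldots, X_{b_s} \cup \alpha_{b_s}\},\ \ldots,\ Q_p \cup \min\{X_{c_1} \cup \alpha_{c_1}, X_{c_2} \cup \alpha_{c_2}, \ldots, X_{c_t} \cup \alpha_{c_t}\}\}$

$= \min\{\min\{X_{a_1} \cup (Q_1 \cup \alpha_{a_1}), X_{a_2} \cup (Q_1 \cup \alpha_{a_2}), \ldots, X_{a_r} \cup (Q_1 \cup \alpha_{a_r})\},\ \min\{X_{b_1} \cup (Q_2 \cup \alpha_{b_1}), X_{b_2} \cup (Q_2 \cup \alpha_{b_2}), \ldots, X_{b_s} \cup (Q_2 \cup \alpha_{b_s})\},\ \ldots,\ \min\{X_{c_1} \cup (Q_p \cup \alpha_{c_1}), X_{c_2} \cup (Q_p \cup \alpha_{c_2}), \ldots, X_{c_t} \cup (Q_p \cup \alpha_{c_t})\}\}$ (distributive law)

$= \min\{\min\{X_{a_1} \cup \alpha'_{a_1}, X_{a_2} \cup \alpha'_{a_2}, \ldots, X_{a_r} \cup \alpha'_{a_r}\},\ \min\{X_{b_1} \cup \alpha'_{b_1}, X_{b_2} \cup \alpha'_{b_2}, \ldots, X_{b_s} \cup \alpha'_{b_s}\},\ \ldots,\ \min\{X_{c_1} \cup \alpha'_{c_1}, X_{c_2} \cup \alpha'_{c_2}, \ldots, X_{c_t} \cup \alpha'_{c_t}\}\}$

$= \min\{X_{a_1} \cup \alpha'_{a_1}, X_{a_2} \cup \alpha'_{a_2}, \ldots, X_{a_r} \cup \alpha'_{a_r},\ X_{b_1} \cup \alpha'_{b_1}, X_{b_2} \cup \alpha'_{b_2}, \ldots, X_{b_s} \cup \alpha'_{b_s},\ \ldots,\ X_{c_1} \cup \alpha'_{c_1}, X_{c_2} \cup \alpha'_{c_2}, \ldots, X_{c_t} \cup \alpha'_{c_t}\}$ (associative law)

$= \min\{X_{d_1} \cup ((\bigcup_j\{\alpha'_{a_j} \mid a_j = d_1\}) \cup (\bigcup_j\{\alpha'_{b_j} \mid b_j = d_1\}) \cup \cdots \cup (\bigcup_j\{\alpha'_{c_j} \mid c_j = d_1\})),\ X_{d_2} \cup ((\bigcup_j\{\alpha'_{a_j} \mid a_j = d_2\}) \cup (\bigcup_j\{\alpha'_{b_j} \mid b_j = d_2\}) \cup \cdots \cup (\bigcup_j\{\alpha'_{c_j} \mid c_j = d_2\})),\ \ldots,\ X_{d_k} \cup ((\bigcup_j\{\alpha'_{a_j} \mid a_j = d_k\}) \cup (\bigcup_j\{\alpha'_{b_j} \mid b_j = d_k\}) \cup \cdots \cup (\bigcup_j\{\alpha'_{c_j} \mid c_j = d_k\}))\}$ (commutative and distributive laws)

$= \min\{X_{d_1} \cup \beta_{d_1}, X_{d_2} \cup \beta_{d_2}, \ldots, X_{d_k} \cup \beta_{d_k}\}$ (associative law),

where $Q_1, \ldots, Q_p$ stand for the values $F^u_{1_1}(\cdot), \ldots, F^u_{1_p}(\cdot)$ computed at the leaf $u$, each $\alpha'$ is the
corresponding $Q \cup \alpha$, and $X_i$, $1 \le i \le 4$, are the unknown target solutions of node $w$, assuming that the
invariant (A) holds before the $prune(u)$ and $bypass(v)$ operation.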

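• $D_{S_v}(G_v) = \min\{F^u_{2_1}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{2_1}(X_1, X_2, X_3, X_4),$
$F^u_{2_2}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{2_2}(X_1, X_2, X_3, X_4),\ \ldots,$
$F^u_{2_q}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{2_q}(X_1, X_2, X_3, X_4)\}$
$= \min\{X_{e_1} \cup \beta_{e_1}, X_{e_2} \cup \beta_{e_2}, \ldots, X_{e_l} \cup \beta_{e_l}\}$.

• $D_{\bar{S}_v}(G_v) = \min\{F^u_{3_1}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{3_1}(X_1, X_2, X_3, X_4),$
$F^u_{3_2}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{3_2}(X_1, X_2, X_3, X_4),\ \ldots,$
$F^u_{3_r}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{3_r}(X_1, X_2, X_3, X_4)\}$
$= \min\{X_{f_1} \cup \beta_{f_1}, X_{f_2} \cup \beta_{f_2}, \ldots, X_{f_m} \cup \beta_{f_m}\}$.

• $H(G_v) = \min\{F^u_{4_1}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{4_1}(X_1, X_2, X_3, X_4),$
$F^u_{4_2}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{4_2}(X_1, X_2, X_3, X_4),\ \ldots,$
$F^u_{4_s}(D(G_u), D_{S_u}(G_u), D_{\bar{S}_u}(G_u), H(G_u)) \cup F^w_{4_s}(X_1, X_2, X_3, X_4)\}$
$= \min\{X_{g_1} \cup \beta_{g_1}, X_{g_2} \cup \beta_{g_2}, \ldots, X_{g_n} \cup \beta_{g_n}\}$.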

Hence, the contribution of the four target solutions to the node $par(v)$ is given
by

(i) $F^w_1(X_1, X_2, X_3, X_4) = F^v_1(D(G_v), D_{S_v}(G_v), D_{\bar{S}_v}(G_v), H(G_v)) = \min\{X_{h_1} \cup \alpha_{h_1}, X_{h_2} \cup \alpha_{h_2}, \ldots, X_{h_k} \cup \alpha_{h_k}\}$
(the simplification is similar to that of the function corresponding to $D(G_v)$
described in the last paragraph). The following functions can be further
simplified to closed form in the same way as (i).

(ii) $F^w_2(X_1, X_2, X_3, X_4) = F^v_2(D(G_v), D_{S_v}(G_v), D_{\bar{S}_v}(G_v), H(G_v))$.

(iii) $F^w_3(X_1, X_2, X_3, X_4) = F^v_3(D(G_v), D_{S_v}(G_v), D_{\bar{S}_v}(G_v), H(G_v))$.

(iv) $F^w_4(X_1, X_2, X_3, X_4) = F^v_4(D(G_v), D_{S_v}(G_v), D_{\bar{S}_v}(G_v), H(G_v))$.

The invariant (A) is then clearly maintained.

Our algorithm for solving the closed system consists of an initial assignment
of four functions to each node of $\mathcal{D}_G$, and an application of the tree
contraction algorithm such that the $prune(u)$ and $bypass(par(u))$ operations are
augmented as specified in the previous paragraph. Once the tree contraction
algorithm terminates, we have a three-node tree $T'$ with a root $\gamma$ holding a
constant number of min and $\cup$ operators and two leaves $u$ and $v$. According to
the functions associated with $u$ and $v$, $D(G_u)$, $D_{S_u}(G_u)$, $D_{\bar{S}_u}(G_u)$, $H(G_u)$ and
$D(G_v)$, $D_{S_v}(G_v)$, $D_{\bar{S}_v}(G_v)$, $H(G_v)$ can be obtained. According to the operators
associated with $\gamma$, a solution $D(G_\gamma) = D(G)$ can be found.

We now discuss the complexities of the above implementation. 

Lemma 6. The functions described in executing $prune(u)$ and $bypass(par(u))$
can be constructed in $O(1)$ time using one processor.

Proof. By the definition of the closed form and the process of the goal function
composition. □

By Lemmas 5 and 6, we have the following result.

Lemma 7. The closed system can be solved in $O(\log n)$ time using
$O(n/\log n)$ processors on an EREW PRAM.

Since solving the perfect dominating set problem using a given decomposition 
tree can be reduced to the problem of solving the closed system, we have the 
following theorem. 

Theorem 2. Given a decomposition tree of a distance-hereditary graph $G$, the
perfect dominating set problem on $G$ can be solved in $O(\log n)$ time using
$O(n/\log n)$ processors on an EREW PRAM.

4 Conclusion 

In this paper, we have solved the perfect domination problem on
distance-hereditary graphs in $O(\log^2 n)$ time using $O(n + m)$ processors on a CREW
PRAM. It follows that the perfect domination problem on distance-hereditary
graphs belongs to NC. The bottleneck of our algorithm is the decomposition
tree construction. If such a tree can be optimally constructed on
an EREW PRAM, the complexities of our algorithm can be further reduced to
$O(\log n)$ time using $O(n/\log n)$ processors on an EREW PRAM.

The technique we utilize is the binary tree contraction technique. Previous
parallel algorithms that use tree contraction to solve a subgraph optimization
problem consist of two phases. The first phase computes the value used to
measure the weight of a target subgraph. The second phase actually finds a
target subgraph according to the information gathered in the first phase. In this
paper, we develop a one-phase tree contraction scheme based on the properties
of the tree contraction and the given problem. We hope the technique can be
applied to more subgraph optimization problems on those graphs which are
tree-representable.

Abstract. We propose a taxonomy of preemptive (suspensive and abortive)
operators capturing the intuition of such operators that exist in the
various synchronous languages. Some of the main contributions of the
paper are: a precise notion of preemption is established at a structural
level; we show that the class of suspensive operators is strictly more
expressive than abortive operators, and we show that suspension is
primitive while abortion is not.

The proof technique relies on a syntactic approach, based on
SOS-specification formats, to categorize the different preemption features;
also, an equivalence criterion between operator specifications is proposed
to provide us with a measure of expressive power.



1 Introduction 

Process preemption is the notion of controlling the life and death of an activity.
Various forms of preemption such as interrupts have been in use in the context
of operating systems for a long time. Well-known occurrences are suspension and
abortion of processes (known, in the case of the Unix operating system, as
^Z and ^C, respectively). However, it is only recently that the vital role of
preemption operators has come to light in the context of synchronous programming
languages. Berry has argued for the need and importance of process
preemption operators for programming reactive and real-time systems that have
inherently a time-dependent model. A variety of preemption operators can be
seen in synchronous languages or formalisms such as Argos, Esterel, Lustre,
Statecharts, Signal, etc. (Some of these operators are described
in Example 2 of this paper.)

In spite of the wide usage of the notion of preemption, there have been no
attempts at characterizing these operators at the structural or transformational
level. The advantage of such a characterization is that it becomes possible to look
at existing notions like expressiveness or completeness, studied for synchronous
calculi such as Meije [Sim85] and SCCS [Mil83], this time in the context of
preemption. Such a characterization will also make it possible to characterize
classes of operators with good behaviors based on syntactic formats such as GSOS,
tyft, etc. Furthermore, such frameworks would also lead to the
study of canonical models from the operational models.

* Work supported by IFCPAR (Indo-French Center for the Promotion of Advanced
Research), New Delhi.

In this paper, we provide a characterization of preemption operators within
such a spirit. First, we propose an SOS-specification of preemption operators
through a sub-fragment of the tyft format called the xyfg/xyfz format.
This framework leads to the natural categorization of the suspensive and abortive
operators capturing the intuition of such operators that exist in the various
synchronous languages. Some of the main contributions of the paper are:

— A precise notion of preemption at a structural level is established. 

— It is shown that the class of suspensive operators is strictly more expressive 
than abortive operators, and 

— For the class of operators we consider (see Section 2.2), it is further
established that suspension is primitive while abortion is not.

The paper is organized as follows: Section 2.1 introduces the framework of
SOS-specifications for describing operators and proposes a criterion of
equivalence between operators (and specifications). Classes of preemption operators
will be described in so-called xyfg/xyfz specifications, introduced in Section 2.2.
In Section 3, the taxonomy of preemption operators is presented; it uses
syntactic criteria over the specification rules of the operators. Finally, Section 4
answers the expressiveness issues and Section 5 concludes this work.

2 Preliminary Notions 

2.1 General Notions 

Notation: We use $V$ to denote a denumerable set of term variables, whose
typical elements are denoted by $v, v_1, v_2, \ldots$, and $X$ to denote a denumerable set of
meta-variables, whose typical elements are denoted by $x, x_1, \ldots, y, y_1, \ldots$.

The distinction between meta-variables and variables is made for pedagogical
purposes: meta-variables like $x, x_1, \ldots$ will be used to describe clauses of the
operational rule schemes, whereas variables $v, v_1, \ldots$ will be used to refer to
contexts as open terms of the language.

Definition 1. (Signature) A signature is a pair $\Sigma = (F_\Sigma, \nu_\Sigma)$ where $F_\Sigma =
\{f, g, \ldots\}$ is a denumerable set of function names disjoint from $V \cup X$, and
$\nu_\Sigma : F_\Sigma \to \mathbb{N}$ is the arity function. We write $(f, n) \in \Sigma$ to express that $f \in F_\Sigma$ and
$\nu_\Sigma(f) = n$.

$\mathbb{T}(\Sigma \cup V \cup X)$ will denote the set of terms over $\Sigma$ that might possibly contain
variables in $V \cup X$, and will be ranged over by $T, T_1, T_2, \ldots, S, U, \ldots$. For $T \in
\mathbb{T}(\Sigma \cup V \cup X)$, $Var(T) \subseteq V \cup X$ denotes the set of variables occurring in $T$. If
$Var(T) \neq \emptyset$ then $T$ is called an open term (over $\Sigma$); otherwise it is a closed term.
$\mathbb{T}(\Sigma)$ will denote the set of closed terms over $\Sigma$, with typical elements $t, t', t_1, t_2, \ldots$.

The height of a term $T \in \mathbb{T}(\Sigma \cup V \cup X)$, written $|T|$, is defined by
induction over the structure of $T$: $|T|$ is 1 if $T$ is a variable; otherwise $T$ is some
$f(T_1, \ldots, T_n)$, and $|T|$ is $1 + \max\{0, |T_1|, \ldots, |T_n|\}$.

A substitution is any partial function $V \cup X \to \mathbb{T}(\Sigma \cup V \cup X)$. Typical elements
for substitutions will be written $\sigma, \sigma', \rho, \ldots$. We write $Dom(\sigma)$ for the domain of the
partial function $\sigma$, and $v\sigma$ for the image of variable $v$ by $\sigma$.

Any substitution $\sigma$ can be extended to the domain of open terms over $\Sigma$ by
taking $T\sigma$ as the term obtained by replacing in $T$ any $v \in Dom(\sigma)$ by $v\sigma$.

Classically, given two open terms $T, T'$ over $\Sigma$ and $v \in Var(T)$, $T[T'/v]$ is
the term $T\sigma$ where $Dom(\sigma) = \{v\}$ and $v\sigma = T'$.
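
The notions of term, height and substitution above can be made concrete with a short Python sketch. The encoding is an assumption made for illustration only: a variable is a string, an application $f(T_1, \ldots, T_n)$ is a tuple whose first component is the function name, and the names `is_var`, `height`, `variables` and `apply` are hypothetical.

```python
def is_var(t):
    """Variables are encoded as strings, applications as tuples (f, T1, ..., Tn)."""
    return isinstance(t, str)


def height(t):
    """|T| = 1 for a variable, 1 + max{0, |T1|, ..., |Tn|} for f(T1, ..., Tn)."""
    if is_var(t):
        return 1
    return 1 + max([0] + [height(arg) for arg in t[1:]])


def variables(t):
    """Var(T): the set of variables occurring in T."""
    if is_var(t):
        return {t}
    result = set()
    for arg in t[1:]:
        result |= variables(arg)
    return result


def apply(t, sigma):
    """T sigma: replace every variable v in Dom(sigma) by its image v sigma."""
    if is_var(t):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(arg, sigma) for arg in t[1:])


NIL = ("nil",)
context = ("nabla", "v1", "v2")                  # an open term with two variables
instance = apply(context, {"v1": ("a.", NIL)})   # context[a.(nil)/v1]
print(instance, height(instance), variables(instance))
```

A substitution is represented here as a dictionary, so its domain is simply the set of keys; variables outside the domain are left untouched, as in the definition of $T\sigma$.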

Example 1. Consider signature $\Sigma_0$ composed of $\{(nil, 0), (a\triangleright, 1), (a{>}, 1), (a., 1),
(b., 1), (\nabla, 2), (\|, 2), (\times, 2), (+, 2)\}$. Using prefix notation for operators, examples
of terms are $a.(nil)$, $\|(a.(nil), b.(nil))$, $a{>}(a.(nil)) \in \mathbb{T}(\Sigma_0)$, and $a\triangleright(v)$,
$\nabla(v_1, v_2) \in \mathbb{T}(\Sigma_0 \cup V)$.

Classically, a formal system is associated to the signature
in order to define the operational semantics of closed terms. This system
is composed of deductive rules that are used to prove (labeled) transitions
between terms. Such a set of rules, together with the signature, is called an SOS
specification.

Throughout the paper, we assume a denumerable set of action names $A$,
disjoint from $\Sigma \cup V \cup X$.

Definition 2. (SOS specification) An SOS specification (over $A$) is a tuple
$P = (\Sigma, R)$ where $\Sigma$ is a signature and $R$ is a set of rules of the form

$$\frac{\{T_i \xrightarrow{a_i} T_i' \mid i \in I\}}{T \xrightarrow{a} T'}$$

where $I$ is finite, $T_i, T_i', T, T' \in \mathbb{T}(\Sigma \cup X)$, $a, a_i \in A$.

Expressions $T_i \xrightarrow{a_i} T_i'$ are called the premises of the rule.

A specification $(\Sigma, R)$ is elementary if $\nu_\Sigma(f) = 0$ for all $f \in F_\Sigma$.

We assume the reader is familiar with the notion of proof of a transition in
the formal system given by the set of rules. However, a formal definition is given
in the Appendix. We write $\pi \vdash_P t \xrightarrow{a} t'$ if $\pi$ is a proof of $t \xrightarrow{a} t'$ (in $P$), and
$\vdash_P t \xrightarrow{a} t'$ if there is some proof of this transition.

We say that two specifications $P_1$ and $P_2$ agree if for all $f \in \Sigma_1 \cap \Sigma_2$,
$\nu_{\Sigma_1}(f) = \nu_{\Sigma_2}(f)$. Given two agreeing specifications $P_1$ and $P_2$, we define the union
of $P_1$ and $P_2$, $P_1 \oplus P_2$, by $(\Sigma_1 \cup \Sigma_2, R_1 \cup R_2)$, where $\Sigma_1 \cup \Sigma_2$ is standardly
defined. Two specifications $P_1$ and $P_2$ are disjoint if $\Sigma_1 \cap \Sigma_2 = \emptyset$ (and therefore
$R_1 \cap R_2 = \emptyset$).

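Example 2. (Typical SOS Rules) Using $\Sigma_0$ defined above, the SOS-specification $P_0$
over $\{a, b\}$ is given below; $P_0$ is based on several operators well known from the
literature (with prefix notation instead of the usual infix one).

Pure Esterel ("a suspend" operator):

$$\frac{}{a\triangleright(x_1) \xrightarrow{a} a\triangleright(x_1)}\ (1) \qquad \frac{x_1 \xrightarrow{b} y_1}{a\triangleright(x_1) \xrightarrow{b} a\triangleright(y_1)}\ (2)$$

Pure Esterel ("do watching a" operator):

$$\frac{x_1 \xrightarrow{a} y_1}{a{>}(x_1) \xrightarrow{a} nil}\ (3) \qquad \frac{x_1 \xrightarrow{b} y_1}{a{>}(x_1) \xrightarrow{b} a{>}(y_1)}\ (4)$$

Let $\mu \in \{a, b\}$.

Operator $\nabla$ of [SP95]:

$$\frac{x_2 \xrightarrow{\mu} y_2}{\nabla(x_1, x_2) \xrightarrow{\mu} y_2}\ (5) \qquad \frac{x_1 \xrightarrow{\mu} y_1}{\nabla(x_1, x_2) \xrightarrow{\mu} \nabla(y_1, x_2)}\ (6)$$

SCCS, Meije, ...:

$$\frac{}{\mu.(x_1) \xrightarrow{\mu} x_1}\ (7) \qquad \frac{x_1 \xrightarrow{\mu} y_1}{+(x_1, x_2) \xrightarrow{\mu} y_1}\ (8) \qquad \frac{x_2 \xrightarrow{\mu} y_2}{+(x_1, x_2) \xrightarrow{\mu} y_2}\ (9)$$

$$\frac{x_1 \xrightarrow{\mu} y_1}{\|(x_1, x_2) \xrightarrow{\mu} \|(y_1, x_2)}\ (10) \qquad \frac{x_2 \xrightarrow{\mu} y_2}{\|(x_1, x_2) \xrightarrow{\mu} \|(x_1, y_2)}\ (11)$$

$$\frac{x_1 \xrightarrow{\mu} y_1 \quad x_2 \xrightarrow{\mu} y_2}{\|(x_1, x_2) \xrightarrow{\mu} \|(y_1, y_2)}\ (12) \qquad \frac{x_1 \xrightarrow{\mu} y_1 \quad x_2 \xrightarrow{\mu} y_2}{\times(x_1, x_2) \xrightarrow{\mu} \times(y_1, y_2)}\ (13)$$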



SOS-specifications naturally describe a set of programs defined by their op- 
erational semantics, i.e. their corresponding state-transitions graphs also called 
Labeled Transition Systems. 

Definition 3. (Labeled Transition System, $TS(P)$, executions, traces)

A labeled transition system (over $A$) is a structure $S = (Q, \to)$ where $Q$ is a
set of states, and $\to \subseteq Q \times A \times Q$ is the transition relation. $(q, a, q') \in \to$ will be
written $q \xrightarrow{a} q'$.

Given $q \in Q$, we write $Ex_S(q)$ for the set of all executions starting from
$q$, defined as all the maximal (finite or infinite) sequences of the form $q =
q_0 \xrightarrow{a_0} q_1 \xrightarrow{a_1} q_2 \ldots$. A trace starting from $q$ is a word $a_0 a_1 \ldots$ s.t. there exists an
execution from $q$ of the form $q = q_0 \xrightarrow{a_0} q_1 \xrightarrow{a_1} q_2 \ldots$. We write $Tr(q)$ for the set of
traces starting from $q$.

Given a specification $P = (\Sigma, R)$ (over $A$), we write $TS(P)$ for the labeled
transition system over $A$ associated to $P$ and defined by $TS(P) = (\mathbb{T}(\Sigma), \to)$
where $t \xrightarrow{a} t'$ whenever $\vdash_P t \xrightarrow{a} t'$.
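
To see how rules induce a transition system, the following Python sketch derives the transitions of closed terms for a fragment of Example 2, namely the prefix rule (7), the choice rules (8) and (9) and the $\nabla$ rules (5) and (6). It is an illustration under assumptions: terms are encoded as tuples whose first component is the operator name, the names `steps` and `lts` are hypothetical, and only the part of $TS(P)$ reachable from a given term is built rather than the whole system over $\mathbb{T}(\Sigma)$.

```python
NIL = ("nil",)


def steps(t):
    """Set of pairs (action, successor) derivable for the closed term t."""
    op, args = t[0], t[1:]
    out = set()
    if op in ("a.", "b."):
        # Rule (7): mu.(x1) --mu--> x1
        out.add((op[0], args[0]))
    elif op == "+":
        # Rules (8) and (9): a choice moves like either of its arguments.
        for x in args:
            out |= steps(x)
    elif op == "nabla":
        x1, x2 = args
        # Rule (5): a move of x2 discards x1.
        out |= steps(x2)
        # Rule (6): a move of x1 keeps x2 in place.
        out |= {(mu, ("nabla", y1, x2)) for mu, y1 in steps(x1)}
    return out  # nil has no rule, hence no transition


def lts(t0):
    """States and transitions reachable from the closed term t0."""
    states, transitions, todo = {t0}, set(), [t0]
    while todo:
        t = todo.pop()
        for mu, t2 in steps(t):
            transitions.add((t, mu, t2))
            if t2 not in states:
                states.add(t2)
                todo.append(t2)
    return states, transitions


p = ("nabla", ("a.", ("b.", NIL)), ("b.", NIL))
states, transitions = lts(p)
print(len(states), len(transitions))
```

Each branch of `steps` mirrors one rule: the premises become recursive calls on the arguments and the conclusion builds the successor term.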



Definition 4. (Bisimulation over labeled transition systems)

[Par81, Mil89] Let $S_1 = (Q_1, \to_1)$ and $S_2 = (Q_2, \to_2)$ be two labeled transition
systems over $A$. A bisimulation relation between $S_1$ and $S_2$ is a total¹ relation
$\rho \subseteq Q_1 \times Q_2$ s.t. $(q_1, q_2) \in \rho$ implies for all $q_1 \xrightarrow{a}_1 q_1'$ there exists $q_2 \xrightarrow{a}_2 q_2'$
with $(q_1', q_2') \in \rho$, and vice versa, for all $q_2 \xrightarrow{a}_2 q_2'$ there exists $q_1 \xrightarrow{a}_1 q_1'$ with
$(q_1', q_2') \in \rho$.

Since bisimulation relations between $S_1$ and $S_2$ are closed under arbitrary
unions, there exists a greatest bisimulation, which we write $\sim_{S_1, S_2}$. We simply
write $\sim_S$ whenever $S_1 = S_2 = S$ and we omit the subscripts when they are clear
from the context.
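
For finite transition systems, the greatest relation satisfying the two transfer conditions of Definition 4 can be computed by a naive fixpoint iteration, sketched below in Python. The function name `greatest_bisimulation` and the encoding of transitions as triples are assumptions; the sketch starts from all pairs of states and deletes violating pairs, and it does not check the totality requirement separately.

```python
def greatest_bisimulation(states1, trans1, states2, trans2):
    """Largest relation between two finite LTSs closed under the transfer conditions.

    trans1 and trans2 are sets of triples (q, a, q2).
    """

    def successors(trans):
        table = {}
        for q, a, q2 in trans:
            table.setdefault(q, set()).add((a, q2))
        return table

    succ1, succ2 = successors(trans1), successors(trans2)
    rel = {(p, q) for p in states1 for q in states2}
    changed = True
    while changed:
        changed = False
        for p, q in list(rel):
            moves_p = succ1.get(p, set())
            moves_q = succ2.get(q, set())
            # Every move of p must be matched by an equally labeled move of q ...
            forth = all(any(a == b and (p2, q2) in rel for b, q2 in moves_q)
                        for a, p2 in moves_p)
            # ... and vice versa.
            back = all(any(a == b and (p2, q2) in rel for a, p2 in moves_p)
                       for b, q2 in moves_q)
            if not (forth and back):
                rel.discard((p, q))
                changed = True
    return rel


# a.(b + c) versus a.b + a.c, the classical pair of non-bisimilar systems.
S1 = {0, 1, 2, 3}
T1 = {(0, "a", 1), (1, "b", 2), (1, "c", 3)}
S2 = {0, 1, 2, 3, 4}
T2 = {(0, "a", 1), (0, "a", 2), (1, "b", 3), (2, "c", 4)}
rel = greatest_bisimulation(S1, T1, S2, T2)
print((0, 0) in rel, (2, 3) in rel)
```

Pairs are only ever removed, so the loop terminates on finite systems; more efficient partition-refinement algorithms exist but are not needed for small examples.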

Specifications also describe program combinators, called contexts here; this 
is formalized in the following. 

Definition 5. (Contexts) Given a specification $P = (\Sigma, R)$, a $\Sigma$-context is
an open term $T \in \mathbb{T}(\Sigma \cup V)$ where all variables occurring in $T$ are distinct.
We write $C(\Sigma)$ for the set of $\Sigma$-contexts, and $C^h(\Sigma)$ for the set of contexts of height
$h \in \mathbb{N}$. We call an $n$-hole context a context with $n$ variables. Zero-hole contexts
are closed terms; they represent programs.

Among $\Sigma$-contexts, those of the form $f(v_1, \ldots, v_n)$, with $n \ge 0$ and $v_1, \ldots, v_n \in
V$, will be called $\Sigma$-operators, and we shall write $O(\Sigma) \subseteq C(\Sigma)$ for the set of
$\Sigma$-operators.

Clearly, any mapping $\phi$ from $O(\Sigma)$ to some domain of terms (maybe based
on another signature $\Sigma'$) can be structurally extended to $C(\Sigma)$ according to
$\phi(v) = v$ for all $v \in V$ and $\phi(f(T_1, \ldots, T_n)) = (\phi(f(v_1, \ldots, v_n)))[\phi(T_i)/v_i]$.
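
The structural extension of a mapping on operators to all contexts can be sketched as follows in Python. The encoding is assumed for illustration: terms are tuples headed by the operator name, variables are strings, the image of an operator $f(v_1, \ldots, v_n)$ is stored as a term over the placeholder variables `"v1"`, ..., `"vn"`, and the names `substitute` and `extend` are hypothetical.

```python
def is_var(t):
    return isinstance(t, str)


def substitute(t, sigma):
    """Simultaneous substitution of the variables in Dom(sigma)."""
    if is_var(t):
        return sigma.get(t, t)
    return (t[0],) + tuple(substitute(arg, sigma) for arg in t[1:])


def extend(phi):
    """Extend phi, given on operators f(v1, ..., vn), to arbitrary contexts."""

    def phi_star(t):
        if is_var(t):
            return t                          # phi(v) = v
        f, args = t[0], t[1:]
        image = phi[f]                        # phi(f(v1, ..., vn))
        sigma = {f"v{i}": phi_star(arg) for i, arg in enumerate(args, start=1)}
        return substitute(image, sigma)       # plug phi(T_i) in for v_i

    return phi_star


# A toy mapping that keeps every operator but swaps the arguments of "+".
phi = {
    "nil": ("nil",),
    "a.": ("a.", "v1"),
    "b.": ("b.", "v1"),
    "+": ("+", "v2", "v1"),
}
translate = extend(phi)
print(translate(("+", ("a.", ("nil",)), ("b.", "x"))))
```

Because the substitution is simultaneous, variables that already occur in the translated arguments are not substituted a second time.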

In order to compare the expressive power of operators, we need a notion of
operational equivalence between specifications. To do so, we introduce a notion
of open bisimulation over contexts which coincides with the standard notion of
bisimulation over closed terms (see Definition 4) when equivalent contexts are
applied to bisimilar programs. The "programs" will be given separately by an
elementary specification (0-arity operators) describing their operational
semantics.

Definition 6. (Open bisimulation) Let $P_i = (\Sigma_i, R_i)$ ($i = 1, 2$) be two
specifications. An open bisimulation between $P_1$ and $P_2$ is a total relation $\chi \subseteq C(\Sigma_1) \times
C(\Sigma_2)$ s.t. for any elementary specification $B = (\Sigma, R)$ (disjoint from $P_1$ and $P_2$)
the relation $\{(T_1\sigma, T_2\sigma') \mid (T_1, T_2) \in \chi,\ \sigma, \sigma' : V \to \mathbb{T}(\Sigma)$ and $v\sigma \sim_{TS(B)} v\sigma'$ for all $v\}$ is
a bisimulation between $TS(P_1 \oplus B)$ and $TS(P_2 \oplus B)$. In the following, we shall
write $T_1 \chi T_2$ instead of $(T_1, T_2) \in \chi$.

Definition 6 of open bisimulation can be related to proposals in the literature:
it generalizes context bisimulation and coincides with instance
closed bisimulation. However, we do not enter into more details since an
accurate comparison between the variants of "open bisimulations" is a topic of
its own and would be out of the scope of this paper.

¹ $\forall q_1 \in Q_1, \exists q_2 \in Q_2, (q_1, q_2) \in \rho$ and vice versa.

We say that two specifications $P_1$ and $P_2$ are equivalent, written $P_1 \equiv P_2$, if
there exists an open bisimulation between $P_1$ and $P_2$; $\equiv$ is indeed an equivalence
relation. Notice that we do not require any congruence property for $\equiv$: indeed,
open bisimilar contexts do not necessarily remain equivalent when instantiated
by bisimilar sub-contexts. The congruence property can be achieved by two-way
explicit syntactic mappings between contexts, leading to a notion of effectively
equivalent specifications.

Definition 7. (Effectively equivalent specifications) We say that $P_1$ and
$P_2$ are effectively equivalent if there exist two mappings $\phi_{12} : O(\Sigma_1) \to C(\Sigma_2)$
and $\phi_{21} : O(\Sigma_2) \to C(\Sigma_1)$ such that the relation $\chi = \{(T, \phi_{12}(T)) \mid T \in C(\Sigma_1)\} \cup
\{(\phi_{21}(S), S) \mid S \in C(\Sigma_2)\}$ is an open bisimulation between $P_1$ and $P_2$.

Effective equivalence of specifications is a strong notion with concrete
applications, as it provides us with a translation between programs (i.e. closed terms)
in a compositional way.

2.2 The Format 

Restricting attention to a particular rule format lets us characterize precisely
the class of operators we are able to treat. In this paper, we consider a subclass
of the so-called tyft format operators of [GV88, GV92]. This subclass is called
the xyfg/xyfz format.

We assume given a specification $P = (\Sigma, R)$ whose rules have the following
general form:

$$\frac{\{T_i \xrightarrow{a_i} y_i \mid i \in I\}}{f(x_1, \ldots, x_n) \xrightarrow{a} T} \qquad (14)$$

where $I$ is a finite set of indexes, $f \in \Sigma$, $x_j$ ($1 \le j \le n$) and $y_i$ ($i \in I$) are
all different variables from $X$, $a_i, a \in A$, and $T_i, T \in T(\Sigma \cup X)$. We then say that
$r$ is a rule for $f$, and we define $op(r) = f$, $lht(r) = f(x_1, \ldots, x_n)$ (for “left-hand
term”), $rht(r) = T$ (for “right-hand term”). Also, we let $Var(r)$ denote the set
of all variables occurring in $lht(r)$.

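Definition 8. (xyfg/xyfz format) An SOS rule is in the xyfg/xyfz format if
it is of the form (15) or (16) below (where $I \subseteq \{1, \ldots, n\}$, $(f, n) \in \Sigma$ and the
$x_j$'s and the $y_i$'s are distinct variables):

Basic rules

$$\frac{\{x_i \xrightarrow{a_i} y_i \mid i \in I\}}{f(x_1, \ldots, x_n) \xrightarrow{a} g(z_1, \ldots, z_m)} \qquad (15)$$

where $(g, m) \in \Sigma$, and $\{z_1, \ldots, z_m\} \subseteq \{x_j\}_{1 \le j \le n} \cup \{y_i\}_{i \in I}$ are distinct
variables s.t. $\forall i \in I$, $\{x_i, y_i\} \not\subseteq \{z_1, \ldots, z_m\}$. Therefore $m \le n$.

Projective rule

$$\frac{\{x_i \xrightarrow{a_i} y_i \mid i \in I\}}{f(x_1, \ldots, x_n) \xrightarrow{a} z} \qquad (16)$$

where $z \in \{x_j\}_{1 \le j \le n} \cup \{y_i\}_{i \in I}$.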

All the rules of Example E| are xyfg/xyfz : Rules lO, 0, Q, (0 and ® 
are projective, the others are basic. In operator (nil, 0) G Eq corresponds to 
(g,m) in dTH). 

The xyfg/xyfz format is at the crossing of several formats in the literature : 
first, it is a fragment of the pure tyft format of [rrvT72| . However, in tyft rules 
as in d, term T can be any open term whereas in an xyfg/xyfz rule, term 
T G 0(E). Also, our format respects the definition of jPinhl ) referred to as 
“without copy” format. Copying allows several occurrences of the same variable 
X (or occurrences of derived variables y in the premises) to occur in the right 
hand side term of the rule. Finally, xyfg/xyfz format strictly generalizes the 
basic format of |Rn92j . As far as we know, most of the well-known description 
languages of the literature (e.g. Meije, SCCS, Esterel, Signal, ...) bear an SOS- 
specification description in this format (see Example ED. 

An operator of a given specification is an xyfg/xyfz operator if all the rules for
it are xyfg/xyfz, and we call an xyfg/xyfz specification (or simply “specification”
for short in the following) any SOS-specification $P = (\Sigma, R)$ s.t. every $f \in \Sigma$ is
xyfg/xyfz.

3 A Taxonomy for Preemption 

The notion of preemption has been widely associated with interrupts, process
scheduling and operating systems. For instance, killing and suspending processes
(for example ^C or ^Z in the UNIX operating systems) have been used widely in
operating systems and hardware description languages. However, it is only in the
works of G. Berry on Esterel that one finds the notion of preemption explicitly
elevated to the status of an operator. In fact, Berry argues for the need to consider
preemption and concurrency as orthogonal features in reactive language
specifications. The two primary categories of preemptive operators are the suspensive
and abortive operators. These operators also appear abundantly in recent
look-ahead architectures (such as IA-64). In terms of
the evolution of the state of processes, besides execution (transition from one state
to the next, different one), we would have suspension (transition while remaining
in the same state) and abortion (transition with loss of any state information
about the process). From this point of view, operators intuitively considered not
preemptive can nevertheless be regarded as performing suspension or abortion.
For example, the sequence operator, as in P1 ; P2, can indeed be understood
as an operator that “installs” both processes, P2 in a suspended state initially,
suspends P2 as long as P1 executes, and aborts P1 when it terminates, while P2 is
resumed. Of course it might not be implemented that way at all, and this view
might seem counter-intuitive (in classical architectures). But if one considers
a system where installing a process takes more time than resuming it,
then this way of achieving sequence might favour reaction speed. In any case,
our concern here is that the behaviour defined is the one where first that
of P1 is observed, and then that of P2.

The intuition behind the use of a syntactic criterion lies in the consideration
that abortion is characterized by the definitive loss of information.

In this section, preemption is characterized in a syntactic way: this charac- 
terization is based on the syntax of the rule schemes describing the operators. 

According to the intuition, a unary operator, like operator aD (v) in Exam- 
ple El has a suspensive feature if there exists a rule for it, namely Rule © , in 
which operator leaves its argument unchanged by the transition. This can be 
generalized to higher arity operators, see below. On the other hand, abortive 
operators would be characterized by SOS-rules in which no reference to the ar- 
gument, or to any of its derivatives, will occur in the target term. Definitions 
below also introduce “relax” operators as a natural complementary notion of the 
suspensive and abortive ones. 

Definition 9. (Suspensive, abortive and evolutive rules) A rule $r$ is said
to be suspensive with respect to the $j$-th argument (or w.r.t. $j$ for short) if
$x_j \in Var(rht(r))$. It is abortive w.r.t. the $j$-th argument (or w.r.t. $j$) if
$x_j \notin Var(rht(r))$ and (if $y_j$ is defined) $y_j$ does not occur in $rht(r)$. Finally,
it is evolutive w.r.t. the $j$-th argument (or w.r.t. $j$) if $j \in I$ and $y_j$ occurs
in $rht(r)$. A rule is relax if it is evolutive w.r.t. all its arguments.
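As an illustration only, the three predicates of Definition 9 can be sketched in Python. The `Rule` record, the encoding of variables as pairs `("x", j)` and `("y", j)`, and the four sample rules are assumptions made for this sketch; they are not the rules of the paper's running example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """Hypothetical encoding of an xyfg/xyfz rule
    {x_i -a_i-> y_i | i in I} / f(x_1..x_n) -a-> rht."""
    op: str               # operator name f
    arity: int            # n
    premises: frozenset   # index set I (1-based argument positions)
    rht_vars: frozenset   # variables of rht(r), as ("x", j) or ("y", j)

def is_suspensive(r, j):
    # x_j itself survives in the right-hand term
    return ("x", j) in r.rht_vars

def is_evolutive(r, j):
    # argument j has a premise and its derivative y_j is kept
    return j in r.premises and ("y", j) in r.rht_vars

def is_abortive(r, j):
    # neither x_j nor (if defined) y_j occurs in the right-hand term
    return ("x", j) not in r.rht_vars and ("y", j) not in r.rht_vars

def is_relax(r):
    return all(is_evolutive(r, j) for j in range(1, r.arity + 1))

# Hypothetical sample rules (not taken from the paper):
r_susp = Rule("s", 1, frozenset(), frozenset({("x", 1)}))        # s(x1) -> s(x1)
r_evol = Rule("s", 1, frozenset({1}), frozenset({("y", 1)}))     # x1->y1 / s(x1) -> s(y1)
r_abort = Rule("k", 1, frozenset(), frozenset())                 # k(x1) -> nil
r_mixed = Rule("q", 2, frozenset({1}), frozenset({("y", 1), ("x", 2)}))
```

Under this encoding a well-formed rule never mentions `("y", j)` unless `j` is in `premises`, which is why `is_abortive` does not need to test whether $y_j$ is defined.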

In Example ( 0 , Rule® (resp. 0 ) is suspensive (resp. evolutive) w.r.t. 1. 
Rule © is abortive w.r.t. 1, and Rule © is relax. Rule © is abortive w.r.t. 1 
and evolutive w.r.t. 2, and Rule © is evolutive w.r.t. 1 and suspensive w.r.t. 2. 
Rule 0 is suspensive w.r.t. 1. Rule (0 (resp. ©) is abortive w.r.t. 2 (resp. 1). 

Etc... 

The taxonomy of Definition 9 extends to ($\Sigma$-)operators. We obtain definitions
for preemptive (more precisely suspensive or abortive) and non-preemptive (also
called relax) ones:

Definition 10. (Suspensive, (pure) abortive and relax operators)

Let f be an xyfg/xyfz operator. We say that f is

- suspensive if it has a suspensive rule w.r.t. some j and no abortive rule,

- abortive if there exists a rule for f that is abortive w.r.t. some j,

- relax if all the rules for f are relax.

We write S (resp. A, R) for the subclass of suspensive (resp. abortive, relax)
operators. Subclasses S, A and R form a partition of xyfg/xyfz operators. An
operator is preemptive if it is suspensive or abortive. Among abortive operators,
we distinguish so-called pure abortive ones, defined as abortive operators with no
suspension feature, i.e. with no suspensive rule. In the following, let Ap
denote the class of pure abortive operators.
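The operator-level taxonomy can be sketched in the same spirit. In this minimal Python sketch, a rule is the hypothetical triple `(arity, premise_indices, rht_vars)` with variables written `("x", j)` or `("y", j)`; the sample operators are invented for illustration and are not the paper's.

```python
def _suspensive(rule, j):
    return ("x", j) in rule[2]

def _abortive(rule, j):
    return ("x", j) not in rule[2] and ("y", j) not in rule[2]

def _args(rule):
    return range(1, rule[0] + 1)

def classify_operator(rules):
    """Return 'abortive', 'suspensive' or 'relax' for the rules of one operator."""
    if any(_abortive(r, j) for r in rules for j in _args(r)):
        return "abortive"
    if any(_suspensive(r, j) for r in rules for j in _args(r)):
        return "suspensive"
    # No rule is abortive or suspensive w.r.t. any argument, so every
    # rule keeps y_j for every j: all rules are relax.
    return "relax"

def is_pure_abortive(rules):
    """Abortive operator without any suspensive rule (class Ap)."""
    return (classify_operator(rules) == "abortive"
            and not any(_suspensive(r, j) for r in rules for j in _args(r)))

# Hypothetical operators:
seq_like = [(2, {1}, {("y", 1), ("x", 2)}),   # runs arg 1, keeps arg 2 suspended
            (2, {2}, {("y", 2)})]             # drops arg 1, continues with arg 2
suspend_like = [(1, set(), {("x", 1)}), (1, {1}, {("y", 1)})]
product_like = [(2, {1, 2}, {("y", 1), ("y", 2)})]
kill_like = [(1, set(), set())]
```

The three tests are checked in the order abortive, suspensive, relax, which mirrors the fact that the classes form a partition: an operator with both a suspensive and an abortive rule, like `seq_like`, lands in A but not in Ap.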

Refering to our example: oD, /r. and || are suspensive, a> and -I- are pure 
abortive. > is abortive but not pure abortive as it possesses a suspensive rule 
(see Rule ©)• Finally, x is relax. The sequence operator mentioned above would 
have a suspensive rule, as well as an abortive rule, and hence, as an operator, it 
is to be considered abortive. In fact, such a flexibility allows the interpretation 
in various architectures (including, lookahead architectures). 

In the literature, mostly classes S and Ap have been considered. In the
Esterel language [Ber93], the “a suspend” operator belongs to S. It describes the
suspension of a process at a signal occurrence (here for signal a). The various
versions of do watching as well as the trap belong to Ap. However, non pure
abortive operators arise very naturally : in Eznsi, operator y describes the 
abortion of a process by the starting of another process. 

4 Expressiveness Issues 

We now address expressiveness issues for the classes of operators S and A. The 
main result of this paper shows that the class S of suspensive operators is strictly 
more expressive than the class A of abortive ones. 

Basically, we show how to replace abortive operators by suspensive ones. To
do so, we replace every abortive rule by a suspensive one, by keeping track of
the absorbed variables in the right-hand term of the rule. Therefore, the arity of
the right-hand term operator has to be increased. However, the arity cannot be
increased locally for each rule: the increase has to take into account all the rules
in which the concerned operator is involved. To do so, we give the specification
a canonical decomposition into sub-specifications, each of them being associated
with an increase of the arity for each of its operators. By uniformly transforming all the
abortive operators into new suspensive ones with greater arity and new rules,
we show that the resulting specification is effectively equivalent to the original
one.

Let $P$ be a specification. We first explain how to decompose $P$: define the
operators graph of $P$, written $G(P)$, as the non-directed graph whose vertices
are all the function names $f \in \Sigma_P$ and whose edges are the pairs $(f, g)$ s.t. there
exists a rule $r$ with $lht(r)$ of the form $f(\ldots)$ and $rht(r)$ of the form $g(\ldots)$, or vice
versa.

$P$ is connected if $G(P)$ is connected. Otherwise, decompose $G(P)$ into connected
components $\{\Sigma_1, \Sigma_2, \ldots\}$ (note that there might be infinitely many $\Sigma_k$'s) and
partition $R$ according to the following equivalence relation: “$op(r_1)$ and $op(r_2)$
belong to the same connected component”. Denote the classes of this equivalence
relation by $\{R_1, R_2, \ldots\}$, where the indices of the $R_k$'s have been chosen in
such a way that $r \in R_k$ iff $op(r) \in \Sigma_k$. Clearly, $P = \bigoplus_k P_k$, where $P_k = (\Sigma_k, R_k)$.
Such a decomposition is unique and is referred to as the canonical decomposition
of $P$.
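For a finite specification, the canonical decomposition amounts to computing the connected components of the operators graph. The following Python sketch assumes a simplified encoding that is not from the paper: each rule is a pair `(lhs_op, rhs_op)`, with `rhs_op` set to `None` when the right-hand term is a variable (a projective rule, which contributes no edge). The paper also allows infinitely many components, which this sketch does not cover.

```python
def canonical_decomposition(operators, rules):
    """Split a finite specification into connected sub-specifications.

    operators: iterable of operator names (every rhs_op must be listed)
    rules:     list of (lhs_op, rhs_op_or_None)
    Returns a list of (set_of_operators, list_of_rules), one per component.
    """
    # Build the non-directed operators graph G(P).
    adjacent = {f: set() for f in operators}
    for f, g in rules:
        if g is not None:
            adjacent[f].add(g)
            adjacent[g].add(f)

    components, seen = [], set()
    for start in sorted(adjacent):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:                      # depth-first traversal
            f = stack.pop()
            if f in component:
                continue
            component.add(f)
            stack.extend(adjacent[f] - component)
        seen |= component
        components.append(component)

    # A rule belongs to the component of its left-hand operator op(r).
    return [(c, [r for r in rules if r[0] in c]) for c in components]

# Hypothetical specification with operators f, g, h and nil:
parts = canonical_decomposition(
    ["f", "g", "h", "nil"],
    [("f", "g"), ("g", "g"), ("h", None)],
)
```

Here `f` and `g` end up in the same component because a rule for `f` has `g` at the head of its right-hand term, while `h` and `nil` each form a component of their own.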

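Let $\Sigma = (F_\Sigma, \mathit{ar}_\Sigma)$ be a signature with bounded arity function $\mathit{ar}_\Sigma$, and define $N$
as the maximal element of the set $\{\mathit{ar}_\Sigma(f) \mid f \in F_\Sigma\}$. We define $\Sigma{\uparrow}N$ by
$F_{\Sigma\uparrow N} = \{f{\uparrow}N \mid f \in F_\Sigma\}$ and $\mathit{ar}_{\Sigma\uparrow N} : F_{\Sigma\uparrow N} \to \mathbb{N}$ delivering the constant value $N$.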

Finally, for a rule $r$, we define $abs(r) \subseteq X$, the set of absorbed variables in
$r$, as the set of all the variables $x_k$ that appear in $lht(r)$ s.t. neither $x_k$ nor $y_k$
(corresponding to a premise if any) occurs in $rht(r)$. For example, in Rule ©,
the set of absorbed variables is $\{x_1\}$.

We now define an extension of a connected specification: 

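Definition 11. (The $N$-extension of $P$) Let $P = (\Sigma, R)$ be a connected
specification and assume the set of arities $\{\mathit{ar}_\Sigma(f) \mid f \in \Sigma\}$ has a maximal element $N$.

We define the $N$-extension of $P$, written $P{\uparrow}N$, by

- case (1) If for all $f \in F_\Sigma$, $\mathit{ar}_\Sigma(f) = N$, then $\Sigma_{P\uparrow N} = \Sigma{\uparrow}N$;

  case (2) otherwise, there exists $f \in F_\Sigma$ s.t. $\mathit{ar}_\Sigma(f) < N$ (and therefore
  $N \ge 1$), then $\Sigma_{P\uparrow N} = \Sigma{\uparrow}N \cup \{(\Pi, N)\}$, where $\Pi$ is a new function
  name.

- $R_{P\uparrow N}$ contains the rules (we write $\overrightarrow{abs}(r)$ to denote the tuple of variables in
  $abs(r)$ ordered by increasing indexes):

  Extended basic rules: if $r$ is like (15) then $r{\uparrow}N$ is ($x_{n+1}, \ldots, x_N$ are fresh
  meta-variables)

  $$\frac{\{x_i \xrightarrow{a_i} y_i \mid i \in I\}}{f{\uparrow}N(x_1, \ldots, x_n, x_{n+1}, \ldots, x_N) \xrightarrow{a} g{\uparrow}N(z_1, \ldots, z_m, \overrightarrow{abs}(r), x_{n+1}, \ldots, x_N)} \qquad (17)$$

  Extended projective rules: if $r$ is like (16) then $r{\uparrow}N$ is ($x_{n+1}, \ldots, x_N$ are
  fresh meta-variables)

  $$\frac{\{x_i \xrightarrow{a_i} y_i \mid i \in I\}}{f{\uparrow}N(x_1, \ldots, x_n, x_{n+1}, \ldots, x_N) \xrightarrow{a} \Pi(z, \overrightarrow{abs}(r), x_{n+1}, \ldots, x_N)} \qquad (18)$$

  If $R$ contains a projective rule, then also add the following rule (referred
  to as $*$-rule):

  $$\frac{x_1 \xrightarrow{a} y_1}{\Pi(x_1, x_2, \ldots, x_N) \xrightarrow{a} \Pi(y_1, x_2, \ldots, x_N)} \qquad (19)$$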
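The construction of an extended rule can be sketched as follows. This is a minimal Python illustration under assumed encodings: a rule is the hypothetical tuple `(op, n, premises, rht)`, where `rht` is either `("op", g, [vars])` for a basic rule or `("var", z)` for a projective rule, variables are pairs `("x", j)` or `("y", j)`, the extended operator is named `f^N`, and the new symbol is spelled `"PI"`. The $*$-rule is not generated here.

```python
def absorbed(n, rht_vars):
    """Tuple abs(r): the x_j (in increasing j) such that neither x_j nor y_j is kept."""
    return [("x", j) for j in range(1, n + 1)
            if ("x", j) not in rht_vars and ("y", j) not in rht_vars]

def extend_rule(rule, N):
    """Build the extended rule for a basic or projective rule, given max arity N."""
    op, n, premises, rht = rule
    fresh = [("x", k) for k in range(n + 1, N + 1)]   # x_{n+1}, ..., x_N
    if rht[0] == "var":
        # Projective rule: the target becomes PI(z, abs(r), fresh...).
        z = rht[1]
        args = [z] + absorbed(n, {z}) + fresh
        new_rht = ("op", "PI", args)
    else:
        # Basic rule: the target becomes g^N(z_1..z_m, abs(r), fresh...).
        _, g, zs = rht
        args = list(zs) + absorbed(n, set(zs)) + fresh
        new_rht = ("op", f"{g}^{N}", args)
    # For a well-formed rule, m + |abs(r)| = n, so the target has exactly N arguments.
    assert len(args) == N
    return (f"{op}^{N}", N, premises, new_rht)

# Hypothetical rules:
kill = ("k", 2, frozenset(), ("op", "nil", []))             # k(x1,x2) -> nil
proj = ("v", 2, frozenset({2}), ("var", ("y", 2)))          # x2->y2 / v(x1,x2) -> y2
step = ("f", 1, frozenset({1}), ("op", "f", [("y", 1)]))    # x1->y1 / f(x1) -> f(y1)
```

In the extended version of `kill`, both absorbed variables reappear in the target, so the rule is no longer abortive w.r.t. any argument; this is the sense in which abortion is replaced by indefinite suspension.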



For example, consider P, composed of rules (jSD and o for operator V. Then 
the maximal arity being 2, and Rule 03) being projective, Rrz is then 
V(xi,X2)



(20) 



(21) 



n{y2,yi) 



n{xi,x2) 



n{yi,X2) 





(22) 



V(yi,a;2) 



Extending connected specifications leads to only suspensive or relax operators:
by construction $abs(r{\uparrow}N) = \emptyset$ for all $r \in R$, and $abs(r) = \emptyset$ iff for every
$j$, $r$ is suspensive or evolutive w.r.t. $j$. Also, if $P$ is connected
and has no abortive rule, then $P$ and $P{\uparrow}N$ are isomorphic (see the definition
in the Appendix).

Theorem 1. Let $P$ be a connected specification with bounded arity (call $N$ the
maximum). Then $P$ and $P{\uparrow}N$ are effectively equivalent (see Appendix for the full
proof).

Notice that specification $P{\uparrow}N$ fits the basic format of [BD92]. Theorem 1, but
mostly its generalization in Theorem 2, shows that, assuming the =-equivalence
criterion is accepted, the xyfg/xyfz format is not more expressive than the basic
format. The translation from $P$ to $P{\uparrow}N$ would deliver precisely the basic
format presentation of CCS already proposed in the literature, together with a proof
that it is correct with regard to the original specification of CCS in [Mil81, Mil89].

4.1 Abortion Can Be Simulated by “for Ever” Suspension

Let an SR-specification be a specification that contains only suspensive or relax
operators. We show how to transform any specification into an equivalent
SR-specification. This is formalized by Theorem 2. Note that the result also holds for
infinite specifications provided connected sub-specifications have bounded arity.

Theorem 2. Let $P$ be a specification (with bounded arity for each connected
sub-specification of $P$). Then there exists an SR-specification, $\tilde{P}$, which is effectively
equivalent to $P$.

Proof (sketch): We use Theorem 1 for each component of the canonical
decomposition of $P$; see Appendix for the full proof.

Theorem 2 is powerful because it delivers a constructive way to translate any
abortive operator into suspensive and relax ones. Moreover, extending the arity
of operators w.r.t. the connected sub-specification they belong to minimizes the
arities in the extended specifications. This technique also applies to infinitely
many connected specifications, even if no global maximal arity in the signature
exists.
4.2 Suspension Is Primitive

We prove here that class S is strictly more expressive than class Ap, in the sense
that it is not possible to express suspension by means of pure abortive and/or
relax operators (the question for non-pure abortive operators is irrelevant, because
non-pure abortive operators necessarily have some suspensive feature, which
can trivially be used to encode suspension). We show the result in the framework
of “terminating” programs, i.e. programs with only finite executions. In this
framework, we first prove that pure abortive and relax operators preserve termination.

Proposition 1. Let $P = (\Sigma, R)$ be an ApR-specification s.t. for any program
$S$, i.e. any $(S, 0) \in \Sigma$, all executions from $S$ are finite. Then, for all $t \in T(\Sigma)$,
any execution $\pi \in Ex_{TS(P)}(t)$ is also finite (see Appendix for the proof).

We now build an 57?.-specification P s.t. there is no ^p7?.-specification equiv- 
alent to P, not even “trace” equivalent: consider Psusp the 57?.-specification 
which signature is {(nzZ, 0), (d, 1)} and which rules are (d) and (0 of Exam- 
ple |5| Clearly, program aD{nil) possesses an infinite trace, namely a“. Thanks 
to Proposition no ending yIp7?.-specification can be used to build a program 
which is trace equivalent to oD (nil), as all its executions would be finite. This 
concludes the proof. 

5 Conclusion - Debate 

In this paper, we have characterized the notion of preemption widely used in 
reactive and real-time programming languages in the context of SOS specifica- 
tions. As far as our knowledge goes, this is the first attempt of characterization 
of preemption in a formal setting rather than invoking virtual/real operating 
system dependent intuitions. The proposed framework has led to the natural 
categorization of the suspensive and abortive operators that exist in the various 
synchronous languages. 

An interesting improvement of this work would be to consider semantic ar- 
guments in between the intuition underlying the preemptive concepts and the 
syntactic categorization proposed here; we believe that a notion of control at- 
tached to processes can achieve this issue. Such solution would give a formal 
rationale to the Definition E3 The latter nevertheless already contains semantic 
motivations, briefly presented in the introduction of Section 01 We conjecture 
that any attempt in defining the preemption concepts at a more semantic level 
would agree with the syntactic taxonomy we defend here, when considered in an 
SOS setting. 

At least, the syntactic categorization provides us with a clear mathematical
framework to tackle expressiveness issues: we have shown that suspensive
operators are strictly more expressive than abortive operators, that is, that
suspensive operators are primitive whereas abortive ones are not.

The translation of abortive operators into suspensive ones has been made
possible through the notion of effective equivalence between specifications. We
argue that effective equivalence between specifications (see Definition 7) is the
right notion to achieve both constructive and compositional translations of
various operators. Indeed, it makes it possible to reason about operators
independently of their context of use.

Finally, we did not say anything about complexity issues, as only expressiveness
aspects are treated here. Note that the present contribution gives a clear
comparison between general infinite classes of operators, but also between
particular (finite) sets of operators. In particular, Theorem 2
is general enough to show that the xyfg/xyfz format is in fact not more
expressive than the basic format of Badouel and Darondeau, by means of the
transformation of a given specification $P$ into $\tilde{P}$.

Expressiveness questions still remain to be answered. For example, since sus- 
pensive (and relax) operators are now proved to be expressive enough, one can 
think of a strict minimal expressive complete sub-class of suspensive operators 
from which any kind of preemption could be derived. In particular, is there any 
“primitive” suspensive operator s.t. any other operator would be captured, using 
simple constructions? If so, then this would be another very strong argument in 
favor of Definition rmi 
Appendix



Definition of Proofs 

Let $P = (\Sigma, R)$ be a specification. We use $\theta, \theta_1, \ldots$ to denote transitions in
$P$, and $T, T_1, \ldots$ to denote expressions in rules. A proof of a transition $\theta$ in $P$
is a finite tree, $\mathcal{P}$, whose edges are ordered and whose vertices are labeled by
transitions in $P$ s.t.:

- the root is labeled by $\theta$,

- if $\theta'$ is the label of some vertex and $\theta_1, \ldots, \theta_m$ are the labels of its children
  (no children when $m = 0$), then there is a rule $r \in R$ of the form

  $$\frac{T_1 \ \ldots \ T_m}{T}$$

  and a substitution $\sigma$ s.t. $\theta_i = T_i\sigma$ and $\theta' = T\sigma$.

We write $\mathcal{P} \vdash_P \theta$ if $\mathcal{P}$ is a proof of $\theta$ (in $P$), and $\vdash_P \theta$ if $\mathcal{P} \vdash_P \theta$ for some proof
$\mathcal{P}$. We omit the subscripts when they are clear from the context. Sub-proofs
are sub-proof-trees.



Isomorphic Specifications 

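Two specifications $P_i = (\Sigma_i, R_i)$ ($i = 1, 2$) are said to be isomorphic if there
exists a bijection $\delta : \Sigma_1 \to \Sigma_2$ s.t. for each rule $r_1 \in R_1$, the rule obtained
by substituting $\delta(f)$ for every operator $f$ occurring in $r_1$ is a rule of $R_2$, and
every rule of $R_2$ is obtained in this way; that is, isomorphic specifications are the same
modulo renaming of operators.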
Proof of Theorem 1

If $P$ contains no abortive rule the result is trivial since $P$ and $P{\uparrow}N$ are isomorphic.
We can now assume that $P$ has at least one abortive rule, entailing $N \ge 1$.

Let us define $\phi$ and $\phi'$ by $\phi(v) = v$ for all $v \in V$,
$\phi(f(v_1, \ldots, v_n)) = f{\uparrow}N(v_1, \ldots, v_n, v_{n+1}, \ldots, v_N)$ for all $(f, n) \in \Sigma$, where $v_{n+1}, \ldots, v_N$
are new variables, and $\phi'(v) = v$ for all $v \in V$, $\phi'(f{\uparrow}N(v_1, \ldots, v_N)) = f(v_1, \ldots, v_n)$
for all $(f, n) \in \Sigma$. Finally, define $\phi'(\Pi(v_1, \ldots, v_N)) = v_1$. Now extend
$\phi$ (resp. $\phi'$) to $C(\Sigma_P)$ (resp. $C(\Sigma_{P\uparrow N})$) by :

To prove that $\phi$ and $\phi'$ provide us with the desired result, we first establish
the following intermediate result:

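Proposition 2. Let $\bowtie\ \subseteq C(\Sigma_P) \times C(\Sigma_{P\uparrow N})$ be the least relation s.t.

- $v \bowtie v$ for all $v \in V$,

- $f(T_1, \ldots, T_n) \bowtie f{\uparrow}N(S_1, \ldots, S_N)$, for all $(f, n) \in \Sigma$, all $T_j \bowtie S_j$ ($j = 1, \ldots, n$)
  and all $S_{n+1}, \ldots, S_N$,

- $T \bowtie \Pi(S, S_2, \ldots, S_N)$ for all $T \bowtie S$ and all $S_2, \ldots, S_N$.

Then relation $\bowtie$ is an open bisimulation between $P$ and $P{\uparrow}N$.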

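Proof. Let $B$ be an elementary specification, disjoint from $P$ and $P{\uparrow}N$.

We define $\sim\ \subseteq T(\Sigma_P \cup \Sigma_B) \times T(\Sigma_{P\uparrow N} \cup \Sigma_B)$ as the least relation such that
$\beta_1 \sim \beta_2$ whenever $\beta_1 \sim_{TS(B)} \beta_2$, $f(t_1, \ldots, t_n) \sim f{\uparrow}N(s_1, \ldots, s_n, s_{n+1}, \ldots, s_N)$
whenever $(f, n) \in \Sigma_P$ and $t_j \sim s_j$ for all $j = 1, \ldots, n$, and $t \sim \Pi(s, s_2, \ldots, s_N)$ whenever $t \sim s$.

It is enough to prove that $\sim$ is a bisimulation between $TS(P \oplus B)$ and
$TS(P{\uparrow}N \oplus B)$.

Let $t \sim s$, and let $\mathcal{P} \vdash_{P \oplus B} t \xrightarrow{a} t'$. We show that there exists $\mathcal{P}'$ s.t.
$\mathcal{P}' \vdash_{P\uparrow N \oplus B} s \xrightarrow{a} s'$ and $t' \sim s'$. The proof is done by induction over the
height of the proof $\mathcal{P}$, written $h(\mathcal{P})$.

$h(\mathcal{P}) = 0$ : The only rule $r$ in $\mathcal{P}$ is an axiom. If $r \in R_B$, then $t = \beta \in T(\Sigma_B)$ and
$s = \beta' \in T(\Sigma_B)$ for some $\beta \sim_{TS(B)} \beta'$ and it is trivial. Otherwise, $r \in R_P$ is
of the form

$$f(x_1, \ldots, x_n) \xrightarrow{a} rht(r)$$

and $t$ is some $f(t_1, \ldots, t_n)$ and $s$ is some $f{\uparrow}N(s_1, \ldots, s_N)$ with all $t_j \sim s_j$
($j = 1, \ldots, n$). $\mathcal{P}$ is obtained by applying rule $r$ to $t$ with substitution $\nu$
where $x_j\nu = t_j$ and $t' = rht(r)\nu$.

Define $\nu'$ by $x_k\nu' = s_k$ for all $k = 1, \ldots, N$. Rule $r{\uparrow}N$ together with $\nu'$ gives a
proof $\mathcal{P}' \vdash_{P\uparrow N \oplus B} s \xrightarrow{a} rht(r{\uparrow}N)\nu'$. By construction of $\nu'$, $t' \sim rht(r{\uparrow}N)\nu'$ (check
the two cases when rule $r$ is basic or projective, i.e. where $rht(r)$ is some
$g(z_1, \ldots, z_m)$ or some $z$).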
$h(\mathcal{P}) > 0$ : This case is very similar to the previous one, with an additional
induction argument for sub-proofs in $\mathcal{P}$. We do not give the proof here.

Reciprocally, let $\mathcal{P}' \vdash_{P\uparrow N \oplus B} s \xrightarrow{a} s'$. We show that there exists some
$\mathcal{P} \vdash_{P \oplus B} t \xrightarrow{a} t'$ with $t' \sim s'$. Again we use an induction over the height of $\mathcal{P}'$, $h(\mathcal{P}')$.

$h(\mathcal{P}') = 0$ : Proof $\mathcal{P}'$ is based on some axiom. If this axiom is in $R_B$ then
$s = \beta' \in T(\Sigma_B)$ and $t = \beta \in T(\Sigma_B)$ with $\beta \sim_{TS(B)} \beta'$ and it is trivial. Otherwise,
the axiom belongs to $R_{P\uparrow N}$ and is necessarily some $r{\uparrow}N$ (because Rule ($*$) is not
an axiom). Proof $\mathcal{P}'$ is obtained from $r{\uparrow}N$ and a substitution $\nu'$. W.l.o.g.
assume $s$ has the form $f{\uparrow}N(s_1, \ldots, s_N)$. Define the substitution
$\nu : X \to T(\Sigma_P \cup \Sigma_B)$ by $x_j\nu = t_j$ for all $j = 1, \ldots, n$. Applying $\nu$ to rule $r$ gives a
proof $\mathcal{P}$ s.t. $\mathcal{P} \vdash_{P \oplus B} t \xrightarrow{a} t'$. By construction of $\nu$, $t' \sim s'$ (consider the two
cases where $r{\uparrow}N$ is extended basic or extended projective).

$h(\mathcal{P}') > 0$ : W.l.o.g. we can avoid the cases where $s$ is some $\Pi(s_1, s_2, \ldots, s_N)$
since $\mathcal{P}' \vdash \Pi(s_1, s_2, \ldots, s_N) \xrightarrow{a} s'$ iff $s' = \Pi(s'_1, s_2, \ldots, s_N)$ and
$\mathcal{P}'' \vdash s_1 \xrightarrow{a} s'_1$ with $h(\mathcal{P}'') < h(\mathcal{P}')$, and $t' \sim \Pi(s'_1, s_2, \ldots, s_N)$ iff $t' \sim s'_1$.

The case $s = f{\uparrow}N(s_1, \ldots, s_N)$ is very similar to the previous one, with an
additional induction argument for sub-proofs in $\mathcal{P}'$. It is straightforward.

Now, we can build a proof by induction over the structure of open terms that
$T \bowtie \phi(T)$ and $\phi'(S) \bowtie S$, for all $T \in C(\Sigma_P)$, $S \in C(\Sigma_{P\uparrow N})$, which concludes.

Proof of Theorem 2

Let $P = (\Sigma, R)$ be an (xyfg/xyfz) specification with bounded arity and write
$N \stackrel{\mathrm{def}}{=} \max\{\mathit{ar}_\Sigma(f) \mid f \in \Sigma\}$. Let $P = \bigoplus_k P_k$ be the canonical decomposition of
$P$ and let $N_k$ be the maximal arity of $P_k$.

Definition 12. We define $\tilde{P} \stackrel{\mathrm{def}}{=} \bigoplus_k (P_k{\uparrow}N_k)$, where in each $P_k{\uparrow}N_k$ the possible
additional symbol $\Pi$ is indexed by $k$.

Write $\bigoplus_k \tilde{P}_k$ for the canonical decomposition of $\tilde{P}$, each $\tilde{P}_k$ with maximal
arity $N_k$.

Call fk and fj. the mappings in the proof of Theorem E when applied to 
Pk and define f : C{Ep) C{Ep) by 4>{T) = T ii T £ V and </'(T) = 
^^>fc(/(^'l, ■■;VNk))[f(Tj)/vj]i<j<n if (/, n) G Ek and T = f(Ti, ...,T„). 

Notice that f and fk coincide over C{Ek). (Also we can define a mapping 
4>' to come back from PtN in a similar way). It remains to prove that T and 
4>{T) are open bisimilar. We reason by induction over the maximal number of 
alternations of function names belonging to different Ek in T : for T £ C{E)^ 
define alt(T), the maximal number of alternations in T by (write T(e) for the 
topmost symbol of T) as follows: 
- $alt(T) = 0$ if $T \in C(\Sigma_k)$,

- $alt(f(T_1, \ldots, T_n)) = \max\{alt(T_j) \mid 1 \le j \le n\}$ if $f, T_j(\varepsilon) \in \Sigma_k$ for some $k$,
  otherwise $1 + \max\{alt(T_j) \mid T_j(\varepsilon) \in \Sigma_{k'} \text{ and } f \notin \Sigma_{k'}\}$.

For example, $alt(\|(a.(nil), v)) = 2$, $alt(+(v_1, +(v_2, v_3))) = 0$.

If alt{T) = 0, then necessarily T G C{Sk) for some k and by Theorem 
and (j>k{T) are open bisimilar. Otherwise alt{T) = h + 1 with h > 0. Then T 
is some Z-holes Iffc-context (with, say, I variables vi,...,vi) applied to contexts 
Ti, ..., Ti with less number of alternations. Formally T = where G C{Ek) 
and Vjp = Tj G (Fi) with h' < h. 

Then by definition of cj>, <p(T) = (j)k{T'^)[(p{Tj) / vj]. 

Let B be an elementary specification. To show that Ta and 4>{T)a' are bisim- 
ilar for all substitution a,a' : V ^ T{Eb) with va '-^ts{b) va' , we use the 
induction hypothesis over Tj and 4>{Tj), that is Tj and 4‘i'^j) ^re open bisimilar. 
Then Tja ~ 4>(Tj)a' . 

Consider the new elementary specification B' composed of programs Tja G 
T{EU Eb) and (j){Tj)a' G T{Ep U Eb)- By applying Theoremdto T^, we have 
~ for substitutions ■■ V ^ T{E U EB)[jT{Ep U Eb) s.t. 

Vj^ = Tja ^ (j){Tj)a' = Vj^' . 

Because = Ta and 4>k{T^)^ = (t>iT)a' we conclude that Ta ~ cj>{T)a' . 

For lack of space we do not give the proof that ()>' {S) and S are open bisimilar 
for some suitable definition of mapping 4>' : 0(T'pf„) ^ C{Ep). 

Proof of Proposition 1

By induction over $|t|$, the height of $t$.

If $|t| = 1$, then $t$ is a constant, and we are done by assumption.

Otherwise, let $t$ be s.t. $|t| \ge 2$ and let $\pi \in Ex(t)$. Assume $\pi$ is infinite, of the
form $t = t_0 \xrightarrow{a_1} t_1 \xrightarrow{a_2} \cdots$. The xyfg/xyfz format ensures that $|t_j| \le |t|$ for all
$j \in \mathbb{N}$.

Now because $P$ is an ApR-specification, only abortive or relax rules apply
along a proof of $\pi$, among which only a finite number of abortive ones (as we
cannot infinitely decrease the height of (finite) terms).

Then there exists $k$ s.t. a proof of $\pi' = t_k \to t_{k+1} \to \cdots$ only uses relax rules.
Clearly $\pi'$ is an infinite execution if $\pi$ is.

By assumption $t_k$ cannot be a constant; it is then of the form $f_k(u_k^1, \ldots, u_k^n)$
with $n \ge 1$, and because only relax rules are applied, all the following $t_i$'s are
some $f_i(u_i^1, \ldots, u_i^n)$.

Now a proof of $t_k \to t_{k+1}$ relies on proofs that $u_k^i \xrightarrow{b_i^{k+1}} u_{k+1}^i$ (for all
$i = 1, \ldots, n$ and some $b_i^{k+1}$), just as a proof of $t_{k+1} \to t_{k+2}$ relies on proofs that
$u_{k+1}^i \xrightarrow{b_i^{k+2}} u_{k+2}^i$ (for all $i = 1, \ldots, n$ and some $b_i^{k+2}$), and so on.

Therefore we can build a proof of some infinite execution from, for example, $u_k^1$,
but because $|u_k^1| < |t_k| \le |t|$, it contradicts the induction hypothesis.
Abstract. The two most commonly used classifications of reference locality
are temporal locality and spatial locality. This paper introduces a new
class of reference locality, called Regional Locality, which is the program
behavior that a set of addresses which are accessed in one critical or
non-critical region will very likely be accessed as a whole in the same
critical region or in other non-critical regions. We propose three updates
propagation protocols based on Regional Locality in Distributed Shared
Memory systems: the Selective Lazy/Eager Updates
Propagation protocol, the First Hit Updates Propagation protocol, and the
Second Hit Updates Propagation protocol. Our experimental results indicate
that Regional Locality exists in executions of many Distributed Shared
Memory concurrent programs. We show that the proposed protocols
outperform the existing updates propagation protocols based on
temporal locality. Exploring Regional Locality in other shared memory
systems would be an interesting future research direction.

Key Words: Distributed Shared Memory, Temporal Locality, Regional 
Locality 



1 Introduction 

Reference locality H3| in program behavior has been studied and explored ex- 
tensively in memory design, code optimization, multiprogramming, etc. There 
are two broad classifications of locality: temporal locality, which means an ad- 
dress accessed in the past is likely to be accessed in the near future; and spatial 
locality, which says an address nearby in memory space to the one just accessed 
is likely to be accessed in the near future. In addition to temporal locality and
spatial locality, many Distributed Shared Memory (DSM) concurrent programs
exhibit a third kind of reference locality, Regional Locality, in their executions.
Before explaining Regional Locality, we need to give a brief introduction to DSM
systems and to regions in executions of DSM programs.

A DSM system provides application programmers with the illusion of shared
memory on top of message passing distributed systems, which facilitates the task of
parallel programming in distributed systems. Many DSM systems require explicit
synchronization primitives in DSM concurrent programs in order to optimize
their performance. An execution of a DSM concurrent program can be viewed
as a sequence of regions which are delimited by synchronization primitives, such
as acquire, release and barrier, as shown in Fig. 1. A critical region begins with an
acquire and ends with a release, while a non-critical region begins with a release
(the outermost one in nested critical regions) or a barrier and ends with an acquire
(the outermost one in nested critical regions) or a barrier. We say two critical regions
are the same if both of them are protected by the same lock.

* The author’s current address: Dept. of Computer Science, University of Otago, P.O.
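The region-based view can be made concrete with a small Python sketch. The event encoding (`("acquire", lock)`, `("release", lock)`, `("barrier",)`, `("access", page)`) is an assumption made for illustration, as are two simplifications: critical regions are not nested, and accesses before the first synchronization primitive are counted as a non-critical region.

```python
def split_into_regions(events):
    """Cut one processor's event trace into regions.

    Returns a list of (kind, lock, pages) with kind 'critical' or
    'non-critical'; lock is None for non-critical regions.
    """
    regions = []
    kind, lock, pages = "non-critical", None, []
    for event in events:
        if event[0] == "access":
            pages.append(event[1])
            continue
        # Every synchronization primitive is a delimiter: close the current region.
        regions.append((kind, lock, pages))
        if event[0] == "acquire":
            kind, lock = "critical", event[1]
        else:  # "release" or "barrier" opens a non-critical region
            kind, lock = "non-critical", None
        pages = []
    regions.append((kind, lock, pages))
    return regions

# Hypothetical trace:
trace = [
    ("access", "m1"),
    ("acquire", "L"), ("access", "m2"), ("access", "m3"), ("release", "L"),
    ("access", "m4"),
    ("barrier",),
    ("access", "m5"),
]
regions = split_into_regions(trace)
```

Two entries of the result with kind `critical` and the same `lock` value correspond to "the same critical region" in the sense used above.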



[Figure: a program execution drawn as a sequence of regions (each a critical or
non-critical region) separated by delimiters (acquire, release, barrier).]

Fig. 1. Region-based view of program execution



Regional Locality is the program behavior that a set of addresses which are
accessed in one critical or non-critical region will very likely be accessed as a
whole in the same critical region or in other non-critical regions. For instance, in
a page-based DSM system, suppose processor P1 enters a critical region and
accesses pages {m1, m2, ..., mn} during the execution of the critical region, and
processor P2 enters the same critical region afterwards. P2 will most likely access
pages {m1, m2, ..., mn} during the execution of this critical region, since the
same critical region usually protects the same set of data objects. Similar
behavior also exists in non-critical regions of a DSM program. For example,
suppose processor P1 enters a non-critical region and accesses pages {m1, m2,
..., mn} during the execution of the non-critical region, and processor P2 enters
another non-critical region afterwards. Since data objects accessed in a
non-critical region often migrate together from one processor to another,
which is regulated by the programmer to avoid data races in non-critical regions,
when P2 accesses one or two members of the page set {m1, m2, ..., mn}, it will
very likely access every member of the set {m1, m2, ..., mn}.
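The critical-region case can be illustrated with a toy Python sketch: remember the page set touched during the latest execution of each critical region (identified by its lock) and offer it as a prediction to the next processor entering the same region. The class `RegionHistory` and the function `hit_ratio` are hypothetical names introduced here; this is not one of the updates propagation protocols proposed in the paper, only a picture of the behavior they rely on.

```python
class RegionHistory:
    """Per-lock memory of the pages accessed in the latest critical region."""

    def __init__(self):
        self.last_pages = {}   # lock id -> set of pages

    def record(self, lock, pages):
        # Called when a processor leaves the critical region guarded by `lock`.
        self.last_pages[lock] = set(pages)

    def predict(self, lock):
        # Called when a processor enters the critical region guarded by `lock`.
        return set(self.last_pages.get(lock, ()))

def hit_ratio(predicted, actual):
    """Fraction of the actually accessed pages that were predicted."""
    actual = set(actual)
    if not actual:
        return 1.0
    return len(set(predicted) & actual) / len(actual)

# P1 runs the critical region guarded by lock "L" and touches three pages.
history = RegionHistory()
history.record("L", ["m1", "m2", "m3"])

# P2 enters the same critical region afterwards.
prediction = history.predict("L")
ratio = hit_ratio(prediction, ["m1", "m2", "m4"])
```

A high hit ratio over many region executions would be evidence of Regional Locality for that lock; a region with no history simply yields an empty prediction.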

Regional Locality is similar to temporal locality in that it acquires the
knowledge of locality from the past execution of the program. The difference
is that temporal locality uses all the addresses accessed by a processor in
the past as one locality group for the processor itself, while Regional
Locality divides the addresses accessed by a processor in the past into groups
according to the program regions in which they were accessed, and uses these
groups as locality groups for all processors. Like other kinds of locality,
Regional Locality can be exploited to improve the performance of DSM programs.
In this paper we focus on exploring Regional Locality in updates propagation
in DSM systems.

The rest of this paper is organized as follows. In Section 2 we propose updates
propagation protocols based on Regional Locality. Our proposed protocols are
compared with related work in Section 3. Experimental results are analyzed and
discussed in Section 4. Finally, the major contributions of this paper and
future work are presented in Section 5.



2 Updates Propagation Based on Regional Locality 

Many weaker sequential consistency models have been proposed for DSM systems.
The goal of these models is to achieve Sequential Consistency on networks of
workstations as efficiently and conveniently as possible. These models can
take advantage of explicit synchronization primitives, such as acquire,
release, and barrier, to select the time, the processor, and the data for
making shared memory consistent.

Even though weaker sequential consistency models can improve performance
by reducing messages for memory consistency, there are still a large number of
messages for updates propagation, which significantly affect the performance of
the DSM systems.

A DSM updates propagation protocol determines when and how updates on
one copy of a page are propagated to other copies of the same page on other
processors. Updates on a page can be represented by a single-writer scheme or by
a multiple-writer scheme. An updates propagation protocol can be integrated
with either a single-writer scheme or a multiple-writer scheme.

A number of different protocols exist for propagating updates in DSM systems.
One protocol, adopted by the TreadMarks DSM system, works as follows: when an
old copy of a page needs to be renewed, the old copy is invalidated first; only
when the invalidated copy is actually accessed by a processor and a page fault
occurs are the updates of the page sent to the processor. We call this protocol
the Lazy Updates Propagation (LUP) protocol since it propagates updates lazily
when updated pages are accessed.

In Lazy Updates Propagation each page fault involves an updates requesting
message to a remote processor and an updates propagating message from the
remote processor. The large number of messages caused by page faults seriously
degrades the performance of DSM systems. If we can prefetch and apply the
updates of several pages in a single page fault, we can reduce page faults and
the messages they cause, and thereby significantly improve the performance of
the DSM system. The challenge is to prefetch as many useful updates as possible
while avoiding the prefetch of useless updates. Prefetch is a double-edged
sword: prefetching useful updates can improve performance, while prefetching
useless updates may seriously degrade it. However, it is non-trivial to detect
which updates are useful and which are useless to a processor.
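
The message arithmetic behind this argument can be sketched as follows. This is
only an illustration, assuming (as stated above) that each page fault costs one
request and one reply message, and that every prefetched page is actually
needed. The function names and the `batch` parameter are hypothetical and are
not part of TreadMarks.

```python
import math

# One updates-requesting message plus one updates-propagating message.
MESSAGES_PER_FAULT = 2


def lup_messages(pages_accessed: int) -> int:
    """Messages under lazy propagation: every invalid page accessed faults once."""
    return pages_accessed * MESSAGES_PER_FAULT


def prefetch_messages(pages_accessed: int, batch: int) -> int:
    """Messages if each fault fetches the updates of `batch` useful pages at once.

    Idealized: assumes all prefetched pages are later accessed.
    """
    faults = math.ceil(pages_accessed / batch)
    return faults * MESSAGES_PER_FAULT


print(lup_messages(8))          # 16
print(prefetch_messages(8, 4))  # 4
```

The sketch deliberately ignores the cost of useless prefetches, which is the
other edge of the sword discussed above.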

In the following sections we use Regional Locality as a heuristic to detect
which updates will be needed in the future execution of a processor. Based on
this knowledge we prefetch useful updates in our novel updates propagation
protocols. To make these protocols concrete, we describe them in the context of
the Lazy Release Consistency (LRC) model.

The LRC model is an improvement of the Eager Release Consistency (ERC)
model. Both ERC and LRC are Release Consistency (RC) models. The RC models
take advantage of explicit synchronization primitives, e.g., acquire, release,
and barrier, to optimize the memory consistency protocols. The ERC model
requires that shared memory updates be propagated outward at release
primitives, while the LRC model postpones updates propagation until another
processor has successfully performed an acquire primitive. At a successful
acquire, the DSM system knows precisely which processor is the next one to
access the shared data, so updates can be propagated only to that particular
processor (or not at all if the next processor is the current processor).
Therefore the LRC model can eliminate more messages than ERC.

It is worth pointing out that the ideas in the following proposed protocols
are independent of any particular consistency model.

2.1 Updates Propagation in Critical Regions 

In updates propagation we are only concerned with the updated pages whose
updates need to be propagated. To explore Regional Locality in updates
propagation in critical regions, every lock in a processor is associated with a
Critical Region Updated Pages Set (CRUPS), which stores the pages updated in a
critical region. The CRUPSs thus hold the knowledge of Regional Locality in
critical regions. A CRUPS is formed as follows. Before a processor enters a
critical region by acquiring a lock, an empty CRUPS is created for the lock. If
the processor updates a page during the execution of the critical region, the
identifier of the page is recorded in the CRUPS of the corresponding lock. When
the processor exits from the critical region, it stops recording in the CRUPS,
but keeps the contents of the CRUPS for use in the next acquisition of the same
lock.

According to Regional Locality, when a processor enters a critical region it
will very likely access the pages previously updated in the same critical
region. So when a processor P2 enters a critical region by acquiring a lock
from another processor P1, P1 can assume that P2 will access the pages in its
CRUPS of the lock, and thus piggy-backs the updates of these pages on the lock
grant message. This idea is essentially a data prefetching technique based on
the acquired knowledge of Regional Locality.
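
The way a CRUPS is formed can be sketched in a few lines. This is a minimal
illustration rather than the TreadMarks implementation: the class `LockState`
and its method names are hypothetical, and write detection (done through page
write-protection in a real DSM system) is reduced to an explicit `on_write`
call.

```python
class LockState:
    """Per-lock bookkeeping on one processor (hypothetical structure)."""

    def __init__(self):
        self.crups = set()       # identifiers of pages updated in the critical region
        self.recording = False

    def on_acquire(self):
        # Entering the critical region: start from an empty CRUPS.
        self.crups = set()
        self.recording = True

    def on_write(self, page_id):
        # Called when an update to a shared page is detected.
        if self.recording:
            self.crups.add(page_id)

    def on_release(self):
        # Leaving the critical region: stop recording, but keep the contents
        # for the next acquisition of the same lock.
        self.recording = False


lock = LockState()
lock.on_write("z")   # update outside the critical region: not recorded
lock.on_acquire()
lock.on_write("y")   # update inside the critical region: recorded
lock.on_release()
lock.on_write("w")   # after release: not recorded
print(lock.crups)    # {'y'}
```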

Based on the above idea we propose a hybrid updates propagation protocol,
called the Selective Lazy/Eager Updates Propagation (SLEUP) protocol [14].
This protocol can be precisely specified as follows.

Protocol 1 The Selective Lazy/Eager Updates Propagation (SLEUP) protocol

For any pair of processors P1 and P2 in a DSM system, suppose P1 has left a
critical region by releasing a lock L, with a Critical Region Updated Pages Set
CRUPS_L for lock L, and P2 is the next processor to enter the same critical
region by acquiring lock L. The updates made by P1 are propagated to P2 as
follows:

1. At the entry of the critical region,

(a) updates of those pages whose page identifiers are in CRUPS_L are
propagated from P1 to P2, and all corresponding copies at P2 are updated;

(b) invalidation notices of those updated pages whose page identifiers are
not in CRUPS_L are propagated from P1 to P2, and all corresponding
copies at P2 are invalidated.

2. During the execution of the critical region in P2, when an invalidated page
is accessed, a page fault triggers the propagation of the updates of the
missing page from P1 to P2. □
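
The two steps of the protocol can be sketched as a pair of functions. This is a
simplified illustration under stated assumptions: the function names and the
dictionary layout of the grant message are hypothetical, and an "update" is
modeled as a value that simply replaces the local copy, whereas a real system
would apply diffs.

```python
def build_lock_grant(updated_pages, crups):
    """Releaser side: split updated pages into eager updates and invalidations.

    updated_pages: dict page_id -> pending updates the acquirer has not seen
    crups: set of page ids recorded for this lock at the releaser
    """
    grant = {"updates": {}, "invalidate": set()}
    for page, diff in updated_pages.items():
        if page in crups:
            grant["updates"][page] = diff    # step 1(a): piggy-back the updates
        else:
            grant["invalidate"].add(page)    # step 1(b): invalidation notice only
    return grant


def apply_lock_grant(local_pages, valid, grant):
    """Acquirer side: apply piggy-backed updates and invalidate the rest."""
    for page, diff in grant["updates"].items():
        local_pages[page] = diff
        valid.add(page)
    for page in grant["invalidate"]:
        valid.discard(page)  # a later access faults and fetches lazily (step 2)


# P1 updated z outside the critical region and y inside it, so CRUPS = {y}.
grant = build_lock_grant({"y": "y-updates", "z": "z-updates"}, crups={"y"})
p2_pages = {"x": "old", "y": "old", "z": "old"}
p2_valid = {"x", "y", "z"}
apply_lock_grant(p2_pages, p2_valid, grant)
print(sorted(grant["updates"]), sorted(grant["invalidate"]))  # ['y'] ['z']
print(sorted(p2_valid))                                        # ['x', 'y']
```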

To illustrate how SLEUP works, we give an example in Fig. 2. Suppose P1
reads x and writes y, and P2 reads y and writes x, in the same critical region
(z is a page that P1 updated outside the critical region). When P1 enters the
critical region for the first time, its write on y is detected and therefore y
is recorded in CRUPS1. When P2 acquires the lock, P1 piggy-backs the updates of
y, whose identifier is in CRUPS1, and the invalidation notice of z, whose
identifier is not in CRUPS1, on the lock grant message. When P2 receives the
updates of y and the invalidation notice of z, it updates its copy of y and
invalidates its copy of z. During the execution of this critical region at P2,
the write on x is detected and therefore x is recorded in CRUPS2. When P1
acquires the lock again, P2 piggy-backs the updates of x, whose identifier is
in CRUPS2, on the lock grant message. From this example we see that, if
Regional Locality holds in critical regions, there is no page fault during the
execution of critical regions, and the number of messages is reduced by
prefetching the updates of the to-be-accessed pages in SLEUP. The experimental
results in Section 4 demonstrate the effectiveness of the SLEUP protocol.

[Figure: timelines of P1 and P2. P1 performs write(z) outside the critical
region, then Acquire, read(x), write(y) and Release, leaving CRUPS = {y}. P2
then acquires the lock; when P1 acquires it again, it receives upd(x) with the
lock grant.]

Fig. 2. An example for the SLEUP protocol



2.2 Updates Propagation in Non-critical Regions 

To explore Regional Locality in updates propagation in non-critical regions, we
detect the pages updated in non-critical regions and aggregate them together.
We propose a Non-Critical Region Updated Pages Set (NCRUPS) scheme for
grouping pages updated in non-critical regions. In every processor we associate
every non-critical region with a NCRUPS. The NCRUPSs actually keep the
knowledge of Regional Locality in non-critical regions. A NCRUPS is formed as
follows. When a processor enters a non-critical region, a unique empty NCRUPS
is created and assigned to the non-critical region; when a processor updates
a page during the execution of a non-critical region, the identifier of the page
is recorded into the corresponding NCRUPS; when a processor leaves a
non-critical region, it stops recording into the corresponding NCRUPS but saves
the NCRUPS for later use.

By using the NCRUPS scheme, we can group pages updated inside each
non-critical region and optimally propagate updates of these pages to a processor
when it is about to access them. We use some hints to decide whether a processor
is about to access the pages in a NCRUPS so as to propagate all the updates of
these pages to the processor. The first hint we use is the first page fault on any
page in a NCRUPS. This hint suggests that all the pages in the NCRUPS might be
accessed soon by the processor according to Regional Locality. Therefore when a
fault on a page in a NCRUPS occurs in a processor, we propagate the updates
of all the pages in the NCRUPS to the processor.

Based on the above idea, we propose an updates propagation protocol called 
First Hit Updates Propagation (FHUP). This protocol is precisely described as 
follows. 

Protocol 2 The First Hit Updates Propagation (FHUP) protocol

For any pair of processors P1 and P2, suppose P1 has left a non-critical region
and stored a NCRUPS N1 for the non-critical region, and P2 is the processor
which enters a non-critical region afterwards. The updates of the pages in N1
are propagated as follows:

1. The invalidation notices of all pages in N1 and N1 itself are propagated
from P1 to P2 at acquire or barrier accesses according to the Lazy Release
Consistency model.

2. When P2 receives the invalidation notices and N1, it invalidates the
corresponding pages and stores N1 in its remote NCRUPS list.

3. During the execution of the non-critical region in P2, if a page fault is
caused by an invalidated page in N1, the updates of all the pages in N1 are
requested from P1 and propagated to P2, and N1 is removed from P2's remote
NCRUPS list. □

The FHUP protocol is a very concise protocol to optimize updates propagation
in non-critical regions. It can effectively reduce the number of messages in
some applications. For example, the Integer Sort (IS) application has the
regular access pattern shown in Fig. 3. IS uses barriers to delimit different
computation stages. At every stage, every processor has an independent set of
working (read/write) pages. The processors shift their working pages with each
other when changing stages. With this access pattern, the FHUP protocol works
as follows: before the second barrier, the locality groups are formed and kept
in the NCRUPSs {1,2,3,4}, {5,6,7,8}, {9,10,11,12}, and {13,14,15,16} in their
respective processors; at the second barrier in P1, the invalidation notices of
pages in the NCRUPSs {5,6,7,8}, {9,10,11,12}, and {13,14,15,16} are propagated
to P1, and these NCRUPSs are stored in P1's remote NCRUPS list; during the
execution of the second non-critical region in P1, once page 5 is accessed, P1
checks whether page 5 is a member of any NCRUPS in its remote NCRUPS list and
finds the matching NCRUPS {5,6,7,8}; P1 requests the updates of the pages
{5,6,7,8} from P2, applies these updates to these pages, and then removes the
NCRUPS {5,6,7,8} from its remote NCRUPS list. For this access pattern, the FHUP
protocol can reduce page faults by prefetching updates of several
to-be-accessed pages in a single page fault.
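
The fault handling of FHUP on the IS-like pattern can be sketched as follows.
This is an illustrative model only: the function name, the representation of
the remote NCRUPS list as (owner, page set) pairs, and the owners P3 and P4 of
the last two groups are assumptions made for the example.

```python
def fhup_on_fault(page, remote_ncrups):
    """Return the set of pages whose updates are requested on one page fault.

    remote_ncrups: list of (owner, frozenset_of_pages) held by the faulting
    processor. Every NCRUPS containing the faulting page is fetched as a whole
    and removed from the list (Protocol 2, step 3).
    """
    matched = [entry for entry in remote_ncrups if page in entry[1]]
    if not matched:
        return {page}  # no group known: plain lazy fetch of the single page
    requested = set()
    for entry in matched:
        requested |= entry[1]
        remote_ncrups.remove(entry)
    return requested


# Remote NCRUPS list of P1 after the second barrier in the IS-like pattern.
p1_remote = [
    ("P2", frozenset({5, 6, 7, 8})),
    ("P3", frozenset({9, 10, 11, 12})),
    ("P4", frozenset({13, 14, 15, 16})),
]
print(sorted(fhup_on_fault(5, p1_remote)))  # [5, 6, 7, 8]
print(len(p1_remote))                       # 2
```

A single fault on page 5 thus brings in the updates of four pages, and the two
untouched groups stay in the list until one of their pages faults.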

Fig. 3. Memory access pattern (1)

However, the FHUP protocol may incur useless updates propagation for some
other access patterns. For example, the Successive Over-Relaxation (SOR)
application has the regular access pattern shown in Fig. 4. SOR uses barriers
to delimit different computation stages. In every non-critical region, P1
updates pages {1,2,3,4} and P2 updates pages {5,6,7,8}, but P1 and P2 falsely
share pages 4 and 5 (false sharing means two processors update different data
objects that lie in the same page). In the second non-critical region, P1 has
the NCRUPS {5,6,7,8} in its remote NCRUPS list, and P2 has the NCRUPS {1,2,3,4}
in its remote NCRUPS list. When the FHUP protocol is applied, the updates of
all the pages {1,2,3,4} are propagated to P2 and the updates of all the pages
{5,6,7,8} are propagated to P1, though P2 only reads page 4 and P1 only reads
page 5. Therefore, the FHUP protocol detects incorrect knowledge of Regional
Locality and thus propagates useless updates in SOR. The useless updates
propagation in FHUP causes performance degradation (see Section 4).

Fig. 4. Memory access pattern (2)



Another problem in the FHUP protocol is that if an invalid page is a member
of two or more NCRUPSs in the remote NCRUPS list, the protocol will propagate
updates of all pages in these NCRUPSs, while some of these pages may not be
accessed by the processor in the following execution. For example, in Fig. 5,
processors shift their working pages as in IS, but their working pages overlap
because the size of the data objects does not align with the size of pages
(this is also a sort of false sharing). In the second non-critical region, P1
has the NCRUPSs {4,5,6,7}, {7,8,9,10} and {10,11,12,13} in its remote NCRUPS
list. Suppose the first page fault is on page 7 in P1. According to the FHUP
protocol, the NCRUPSs {4,5,6,7} and {7,8,9,10} will be selected, and P1 will
request updates of pages {4,5,6,7} and {7,8,9,10} from P2 and P3 respectively,
though P1 only accesses pages {7,8,9,10} in the second non-critical region.
Again FHUP detects incorrect knowledge of Regional Locality and thus propagates
useless updates for the above access pattern.



[Figure: four processors, P1 to P4, with computation stages delimited by
barriers. The read/write working sets shown are the overlapping page groups
{1,2,3,4}, {4,5,6,7}, {7,8,9,10} and {10,11,12,13}, which shift between
processors from one stage to the next.]

Fig. 5. Memory access pattern (3)





From the above examples we see that the FHUP protocol does not have
sufficient hints to detect correct knowledge of Regional Locality for false
sharing access patterns, and may cause useless updates propagation. To overcome
this drawback, we use both the first and the second page faults on pages in a
NCRUPS as hints. That is, if a page in a NCRUPS is accessed in a non-critical
region by a processor, and later another page in the same NCRUPS is accessed in
the same non-critical region by the same processor, then all the pages in the
NCRUPS are believed very likely to be accessed by the processor, and therefore
the updates of all the pages in the NCRUPS are propagated to the processor.

Based on the above idea we propose an updates propagation protocol called 
Second Hit Updates Propagation (SHUP). This protocol is precisely specified as 
follows. 

Protocol 3 The Second Hit Updates Propagation (SHUP) protocol

For any pair of processors P1 and P2, suppose P1 has left a non-critical region
and stored a NCRUPS N1 for the non-critical region, and P2 is the processor
which enters a non-critical region afterwards. The updates of pages in N1 are
propagated as follows:

1. The invalidation notices of all pages in N1 and N1 itself are propagated
from P1 to P2 at acquire or barrier accesses according to the Lazy Release
Consistency model.

2. When P2 receives the invalidation notices and N1, it invalidates the
corresponding pages and stores N1 in its remote NCRUPS list.

3. During the execution of the non-critical region in P2, if a page fault is
caused by an invalidated page which is a member of N1 in P2's remote NCRUPS
list, P2 tests whether N1 is labeled as first-hit in its remote NCRUPS list.

- If N1 is not labeled as first-hit, the updates of the fault page are requested
from P1 and propagated to P2, and the fault page is removed from N1,
which is then labeled as first-hit in P2's remote NCRUPS list.

- If N1 is labeled as first-hit, the updates of the pages in N1 which have
not yet been sent to P2 are requested from P1 and propagated to P2, and N1
is removed from P2's remote NCRUPS list.

4. When P2 leaves the non-critical region, all the NCRUPSs labeled as first-hit
in P2's remote NCRUPS list are reset, and all the empty NCRUPSs are
removed from the list. □

The advantage of the SHUP protocol is that the second page fault is used
to detect Regional Locality more accurately and avoid useless updates
propagation. For example, for the access pattern in Fig. 4, since there is no
second page fault, the SHUP protocol only propagates the updates of page 5 to
P1, rather than propagating the updates of all the pages {5,6,7,8} to P1 as in
the FHUP protocol. Also, for the access pattern in Fig. 5, the SHUP protocol
only propagates the updates of the pages {7,8,9,10} to P1 because the second
page fault on page 8 rejects the NCRUPS {4,5,6,7}.
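
The first-hit/second-hit logic can be sketched and run on the two false-sharing
patterns discussed above. This is a simplified model, not the authors'
implementation: the class `RemoteNcrups`, the function names, and the owner P4
of the third group are assumptions, and which processor actually serves a page
that belongs to two groups is not modeled.

```python
class RemoteNcrups:
    """One entry of a remote NCRUPS list (hypothetical structure)."""

    def __init__(self, owner, pages):
        self.owner = owner
        self.pages = set(pages)  # pages whose updates have not been fetched yet
        self.first_hit = False


def shup_on_fault(page, remote_list):
    """Pages whose updates are requested for one fault (Protocol 3, step 3)."""
    requested = {page}
    for n in [n for n in remote_list if page in n.pages]:
        if not n.first_hit:
            n.pages.discard(page)   # first hit: fetch only the faulting page
            n.first_hit = True
        else:
            requested |= n.pages    # second hit: fetch everything still pending
            remote_list.remove(n)
    return requested


def shup_on_region_exit(remote_list):
    """Step 4: reset first-hit labels and drop empty NCRUPSs."""
    for n in remote_list:
        n.first_hit = False
    remote_list[:] = [n for n in remote_list if n.pages]


# SOR-like pattern: P1 only ever touches page 5 of the group {5, 6, 7, 8}.
sor = [RemoteNcrups("P2", {5, 6, 7, 8})]
print(sorted(shup_on_fault(5, sor)))  # [5]

# Overlapping groups: P1 faults on page 7, then on page 8.
overlap = [
    RemoteNcrups("P2", {4, 5, 6, 7}),
    RemoteNcrups("P3", {7, 8, 9, 10}),
    RemoteNcrups("P4", {10, 11, 12, 13}),
]
first = shup_on_fault(7, overlap)
second = shup_on_fault(8, overlap)
print(sorted(first | second))  # [7, 8, 9, 10]
```

The price of this caution is one extra fault per group compared with FHUP,
since the group is fetched only on the second hit.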

3 Comparison with Related Work 

There is no updates propagation protocol explicitly exploring Regional Locality. 

A Lazy Hybrid (LH) protocol has been proposed based on temporal locality. The
idea behind the LH protocol is that programs usually have significant temporal
locality, and therefore any page accessed by a process in the past is likely to
be accessed in the future. The LH protocol therefore selects updates of pages
that have been accessed in the past (regardless of whether or not in the same
critical/non-critical region) by the processor acquiring a lock or arriving at
a barrier, and piggy-backs the updates on grant messages. The similarity
between LH and our protocols is that both use some kind of locality heuristic
to prefetch updates of pages. The major difference is the following: LH uses a
heuristic that does not distinguish the accessed pages which are in the same
critical/non-critical region from those which are not, whereas our protocols
make this distinction based on Regional Locality and hence can be more accurate
in selecting the updates for prefetch. Since the heuristic in the LH protocol
is very speculative, it can cause useless updates propagation and thus degrade
the performance of the underlying DSM system. This point has been verified by
our experimental results.

Other updates propagation protocols based on data prefetching have also been
proposed. They are similar to our First Hit Updates Propagation in that they
use the first page fault to trigger the prefetch of the pages of a group in
non-critical regions. But their criteria for grouping pages are different. In
one approach, page groups have a fixed size which is decided by the user or by
the system at run-time. Accessed pages are filled in sequence into a group
until the group is full, and then another group is created. The Adaptive++
technique fills into a group those pages updated between two barriers, whereas
FHUP fills into a group those pages updated inside a region.

No other protocol uses the second page fault, as our Second Hit Updates
Propagation does, to avoid useless updates propagation.



4 Experimental Results 

All our protocols are implemented in TreadMarks. The Lazy Hybrid protocol is
also implemented in TreadMarks in order to compare Regional Locality with
temporal locality in DSM. All these protocols are evaluated against the Lazy
Updates Propagation (LUP) protocol adopted in TreadMarks, which does not
explore any locality and serves as a baseline for those exploring locality.

The experimental platform consists of 8 SGI workstations running IRIX Re- 
lease 5.3. These workstations are connected by a 10 Mbps Ethernet. Each of 
them has a 100 MHz processor and 32 Mbytes memory. The page size in the 
virtual memory is 4 KB. 

We used 8 applications in the experiment: TSP, BT, QS, Water, FFT, SOR,
Barnes, and IS, among which the source code of TSP, QS, Water, FFT, SOR,
Barnes, and IS is provided by the TreadMarks research group. All the programs
are written in C. TSP (Travelling Salesman Problem) finds the minimum cost path
that starts at a designated city, passes through every other city exactly once,
and returns to the original city. BT (Binary Tree) is an algorithm that creates
a fixed-depth binary tree. In the algorithm multiple processes explore a binary
tree to search for unexpanded nodes. If a process finds an unexpanded node, it
expands the node and creates new unexpanded nodes. The algorithm terminates
when the fixed-depth binary tree is established. QS (Quick Sort) is a recursive
sorting algorithm that operates by repeatedly partitioning an unsorted input
list into a pair of unsorted sublists, such that all of the elements in one of
the sublists are strictly greater than the elements of the other, and then
recursively invoking itself on the two unsorted sublists. Water is a molecular
dynamics simulation. At each time-step, the intra- and inter-molecular forces
incident on a molecule are computed. FFT (3-D Fast Fourier Transform)
numerically solves a partial differential equation using forward and inverse
FFTs. SOR (Successive Over-Relaxation) uses a simple iterative relaxation
algorithm. The input is a two-dimensional grid. During each iteration, every
matrix element is updated to a function of the values of neighboring elements.
Barnes (Barnes-Hut) simulates the evolution of a system of bodies under the
influence of gravitational forces. It is a classical gravitational N-body
simulation, in which every body is modeled as a point mass and exerts forces on
all other bodies in the system. IS (Integer Sort) ranks an unsorted sequence of
N keys. The rank of a key in a sequence is the index value i that the key would
have if the sequence of keys were sorted. All the keys are integers in the
range [0, Bmax] and the method used is bucket sort.

Among these applications, TSP and BT only use locks for synchronization,
and QS uses one lock to protect a task queue. Water uses both locks and
barriers for synchronization, and FFT, SOR, Barnes, and IS only use barriers
for synchronization. The FHUP and SHUP protocols are not applied to TSP and
BT since there is no update on shared memory in non-critical regions in these
two applications. Also, since there are no critical regions in FFT, SOR,
Barnes, and IS, the SLEUP protocol is not applied to them.

The experimental results are given in Table 1. In the table, Time is the total
running time of an application program; Total Data is the total amount of
message data; Updates Data is the total amount of propagated updates data; Page
Fault is the number of page faults; and Mesgs is the total number of messages.

4.1 Regional Locality 

From the experimental results we see that Regional Locality exists in many DSM
concurrent programs. Among the applications with Regional Locality are TSP,
BT, QS, Water, Barnes, and IS. By applying SLEUP, FHUP and SHUP, which
explore Regional Locality, the average improvement in the performance of these
applications is 20.2% compared with LUP in the original TreadMarks system.
The maximum improvement is 53.8% (TSP). In particular, by exploring
Regional Locality, the number of page faults and the number of messages are
reduced to 46% and 66% respectively on average. There is no improvement in
the performance of some applications, such as FFT and SOR, because they
do not have any Regional Locality.

4.2 Regional Locality vs. Temporal Locality 

Protocols based on Regional Locality outperform those based on temporal
locality for all of our applications. Compared with LUP, LH degrades the
performance of many programs, such as Water, FFT, SOR, Barnes, and IS. (Because
the message buffer overflows at barriers¹, we have not provided running results
of Barnes based on the LH protocol.) The average degradation is 34.4%, and
the maximum degradation is up to 108.6% (FFT). The reason for the degradation
is that LH propagates a large amount of useless updates. The average
amount of useless updates propagated in LH is 27.8% of the total propagated
updates. Even though LH can improve some applications, such as TSP, BT,
and QS, its performance is still not as good as that of SLEUP/SHUP/FHUP. The
performance of SLEUP/SHUP/FHUP is 17.7% better than that of LH on average.
The average amount of updates propagated in SLEUP/SHUP/FHUP is 29.7% less than
that in LH. Even though in some applications, such as FFT, SOR and IS, the
number of page faults and the number of messages in LH are smaller than those
in SLEUP/SHUP/FHUP, the overall performance of SLEUP/SHUP/FHUP is better than
that of LH since LH propagates a large amount of useless updates.

¹ The buffer overflows because of too much (useless) updates propagation at
barriers in LH, and therefore its performance will be further degraded at
barriers.



application  protocol     Time     Total Data  Updates Data  Page Fault  Mesgs
                          (secs)   (bytes)     (bytes)

TSP          LUP          15.86    1267683     448958        1029        2846
             LH           8.63     1287368     463437        355         1405
             SLEUP        7.33     1252737     443896        245         1209

BT           LUP          82.92    39511375    8921228       26478       96979
             LH           72.08    40964979    9390072       13918       68542
             SLEUP        69.71    39148835    8761972       6469        53925

QS           LUP          20.09    10153006    6100023       3046        10432
             LH           15.52    10844953    6962709       962         6095
             SLEUP        13.36    9165498     5354832       956         5936
             SLEUP+FHUP   14.96    11596416    7838800       829         5447
             SLEUP+SHUP   12.38    9282800     5430895       930         5886

Water        LUP          32.59    11717602    9980061       4314        24495
             LH           36.82    14535830    12590288      2137        21668
             SLEUP        31.07    11834142    9981561       3024        21920
             SLEUP+FHUP   31.92    13759521    11607920      1733        18906
             SLEUP+SHUP   30.63    12159638    9979899       1992        19764

FFT          LUP          4.44     3220826     2188032       557         2135
             LH           9.26     5540076     4487644       174         1735
             FHUP         4.87     3902122     2820048       291         1603
             SHUP         4.60     3306240     2188032       557         2136

SOR          LUP          13.70    7391113     14140         203         4301
             LH           15.10    7934204     473636        16          4992
             FHUP         14.53    7556048     134885        203         4302
             SHUP         13.84    7416629     14140         203         4303

Barnes       LUP          49.38    50943423    37198386      12791       75943
             LH           X        X           X             X           X
             FHUP         48.14    55534888    37687510      12640       74318
             SHUP         49.17    55136208    37199430      12763       75659

IS           LUP          113.42   71732008    69626536      4444        11305
             LH           120.15   75004402    73404180      192         8044
             FHUP         108.20   72100823    69626536      2774        7965
             SHUP         110.62   72223052    69623400      3998        10384

Table 1. Performance Statistics for applications



From the above discussion we see that temporal locality is more speculative
than Regional Locality. Temporal locality does not provide knowledge of the
to-be-accessed data that is as accurate as that of Regional Locality. This
inaccuracy causes useless updates propagation and degrades the performance of
DSM systems.
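
The headline percentages quoted in this section can be recomputed from the
running times in Table 1. The sketch below assumes that the averages are
unweighted means and that, for each application with Regional Locality, the
fastest Regional Locality configuration is compared with LUP; the text does not
state how the averages were formed, so this is an assumption, and the variable
names are chosen for illustration.

```python
# Running times in seconds, copied from Table 1.
lup = {"TSP": 15.86, "BT": 82.92, "QS": 20.09, "Water": 32.59,
       "FFT": 4.44, "SOR": 13.70, "Barnes": 49.38, "IS": 113.42}
lh = {"TSP": 8.63, "BT": 72.08, "QS": 15.52, "Water": 36.82,
      "FFT": 9.26, "SOR": 15.10, "IS": 120.15}  # no LH result for Barnes
# Fastest Regional Locality configuration for each application that has it.
best_regional = {"TSP": 7.33, "BT": 69.71, "QS": 12.38, "Water": 30.63,
                 "Barnes": 48.14, "IS": 108.20}


def improvement(base, new):
    """Percentage reduction in running time relative to `base`."""
    return 100.0 * (base - new) / base


gains = {app: improvement(lup[app], t) for app, t in best_regional.items()}
avg_gain = sum(gains.values()) / len(gains)

# Applications that LH slows down relative to LUP.
losses = {app: -improvement(lup[app], t) for app, t in lh.items() if t > lup[app]}
avg_loss = sum(losses.values()) / len(losses)

print(f"average improvement over LUP: {avg_gain:.1f}%")             # 20.2%
print(f"largest improvement over LUP: {max(gains.values()):.1f}%")  # 53.8%
print(f"average LH degradation: {avg_loss:.1f}%")                   # 34.4%
print(f"largest LH degradation: {max(losses.values()):.1f}%")       # 108.6%
```

Under these assumptions the computed values agree with the figures reported in
Sections 4.1 and 4.2.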

4.3 Detection of Regional Locality

We use the CRUPS scheme to detect Regional Locality in critical regions, and use
the NCRUPS scheme and the first/second page fault to detect Regional Locality
in non-critical regions. The accuracy of the detected Regional Locality affects
the performance of the protocols based on it. On one hand, incorrect knowledge
of Regional Locality causes useless updates propagation. For example, for FFT
and SOR the FHUP protocol detects incorrect knowledge of Regional Locality, so
FHUP propagates useless updates and degrades performance in these two
applications. On the other hand, incomplete knowledge of Regional Locality
hinders the improvement of performance. For instance, SHUP cannot find complete
Regional Locality as quickly as FHUP in IS and Barnes, so SHUP does not perform
as well as FHUP for these two applications.

From the above discussion we know, even though both FHUP and SHUP 
are based on Regional Locality, FHUP is more speculative while SHUP is more 
conservative in terms of detection of Regional Locality. Their merits become 
prominent in different applications. 

The overhead of the CRUPS scheme is very small because it takes advantage
of the write-protection mechanism provided in the TreadMarks system. There
is some overhead for bookkeeping the remote NCRUPS list in the NCRUPS
scheme. For example, for FFT and SOR, where there is no Regional Locality,
SHUP slightly degrades their performance (3.6% degradation for FFT, 1.0%
degradation for SOR) because of this bookkeeping overhead.

5 Conclusions 

In this paper, we have discussed the program behavior called Regional Locality
and evaluated this new class of reference locality in updates propagation in
DSM systems. We have proposed three novel updates propagation protocols, SLEUP,
FHUP, and SHUP, which explore Regional Locality in DSM systems. The
experimental results indicate:

1. Regional Locality exists in executions of many Distributed Shared Memory 
concurrent programs. Updates propagation protocols exploring Regional Lo- 
cality significantly improve the performance of the DSM systems. 

2. The protocols based on Regional Locality outperform those based on the more
speculative temporal locality. Protocols exploring temporal locality cause
performance degradation for many applications in our experiment.

Our future research is to explore Regional Locality in other shared memory 
systems. 


Abstract. In this paper, we explore the isomorphism between vector
time and causality to characterize consistency of a set of checkpoints
in a distributed computation. A necessary and sufficient condition to
determine whether a set of checkpoints can form a consistent global checkpoint
is presented and proved using the isomorphic power of vector time and
causality. To the best of our knowledge, this is the first attempt to use
the isomorphism for this purpose. This condition leads to a simple and
straightforward algorithm for guaranteed mutually consistent global
checkpointing. In our approach, a process can take a checkpoint whenever
and wherever it wants, while other related processes may be asked to
take an additional checkpoint to ensure mutual consistency. We
also show how this condition and the resulting algorithm can be used to
obtain maximum and minimum global checkpoints, another important
paradigm for distributed applications.



1 Introduction 

A large class of important problems in distributed systems can be cast as
periodically taking consistent global checkpoints and executing some reactions
based on the checkpoints that have been taken. Examples of such problems
include distributed debugging and monitoring, fault tolerance and rollback-based
recovery, and detection of state properties such as deadlock and termination.
This paradigm requires consistently recording (often, periodically recording)
the global state of a distributed computation. Informally, a global state is a
collection (union) of the local states, one from each process of the
computation, recorded by a process. The saved process state is called a
checkpoint. A global state is also called a global checkpoint. Such a
checkpoint is said to be consistent if it has been passed through, or if it
could have been passed through, by the current computation [?]. In a
distributed system based on message passing without shared memory,
consistency means that if a message receive event is included in a checkpoint,
the corresponding message send event should also be included. Obviously,
there is no distributed computation in which a message has been recorded
as being received but has not been sent in the recording. Thus, a fundamental
problem in distributed computing is to ensure that a global checkpoint
obtained is consistent, which is the focus of this paper.

In this paper, we first present a necessary and sufficient condition to
determine whether a set of local checkpoints can form a consistent global
checkpoint. This condition uses the isomorphism between vector time and the
causal partial order. In the literature, there are two published works that
describe such a necessary and sufficient condition [?]. Netzer-Xu’s [?]
necessary and sufficient condition to characterize a consistent global state is
expressed in terms of a “zigzag” relation defined on the set of local
checkpoints. Baldoni et al.’s condition is based on the precedence relation
defined on checkpoint intervals. While our condition and their conditions are
equivalent from a theoretical point of view, our condition, derived from the
isomorphism between vector time and causal order, leads to a simple and
straightforward algorithm that guarantees mutually consistent checkpoints, as
described later in this paper.

There have been many algorithms for obtaining distributed global
checkpoints; see, for example, [?] for a survey. Generally, there are two
approaches to checkpointing:

— Coordinated approach: the processes coordinate their checkpointing actions 
such that the current instance of global checkpoint in the system is guaran- 
teed to be consistent. Such a consistent set of checkpoints can then be used, 
for example, for recovery. When a failure occurs, the system restarts from 
these checkpoints. The disadvantages of this approach are that it requires a 
number of communication messages between processes for each checkpoint- 
ing and introduces synchronization delays during normal operation. 

— Uncoordinated or independent approach: In the second approach each process 
takes checkpoints independently (whenever and wherever it wants) and saves 
them on its stable storage [?], and when a consistent global checkpoint is
required, it has to be constructed from the available set of local checkpoints. 
There is no guarantee that a consistent global checkpoint can actually be 
constructed. In fact, some of the local checkpoints might turn out to be 
useless as they belong to no consistent global state. This approach, if
used in a fault-tolerant system based on recovery, would lead to the so-called
domino effect [12].

The algorithm we propose in this paper is a combination of the above two
approaches, and is thus called a semi-coordinated approach: a process can
independently take its local checkpoints (whenever and wherever it wants), and
only if doing so results in an inconsistent checkpoint, as determined by
checking against the necessary and sufficient condition, are the related
processes forced to take additional local checkpoints too. This guarantees
mutually consistent checkpoints. In our algorithm, we use vector time as an
instrument to provide this guarantee.

The rest of this paper is organized as follows. Section 2 describes the system
model on which this work is based. In Section 3, we review the vector time
mechanism and the isomorphism between it and causal precedence, and then
derive the necessary and sufficient condition for obtaining consistent global
checkpoints. Section 4 presents a vector-time-based checkpointing algorithm
that guarantees mutually consistent checkpointing. Section 5 shows how our
algorithm can be used to construct maximum and minimum consistent global
checkpoints. Finally, we conclude the paper in Section 6.



2 System Model 

A distributed system consists of a finite set of $n$ processes
$\{P_1, \ldots, P_n\}$, each executing its own sequential program modeled by a
sequence of events. These processes communicate with each other solely by
message passing. We assume that the communication channels between processes
are FIFO and reliable. Events correspond to the state changes that take place
in the process. Such a sequence of events, denoted by $E_i$, is called the
local history of the local computation $P_i$. The collection of local histories
of processes participating in a distributed system forms the execution history
of the system. For notational purposes, $e_i^x$ denotes the $x$th event
($x > 0$) executed by $P_i$, or simply $e_i$ if the ordinal position is not
important.

For our purpose, the events of interest in any distributed system are the
sending and receiving of messages, and internal events. The checkpointing events
that correspond to the recording of local states by individual processes are the
only internal events we consider in this paper. We assume that the first event
in each process is an internal local checkpoint event. Also, we do not explicitly
consider messages in the execution history; rather, they are implied by the
presence of message send and receive events. The global set of events appearing
in the execution history cannot be placed naturally in a total order, whereas
this is possible for the events of a single process. Instead, a partial order
on the events can be defined using Lamport’s happen-before relation: in the
global event set, we say that event $e_i$ directly happens before event $e_j$,
denoted by $e_i \prec e_j$, if

— $e_i$ and $e_j$ are events in the same process and $e_i$ occurs immediately
before $e_j$; or

— $e_i$ is the sending of a message $m$ and $e_j$ is the receiving of $m$.

The transitive closure of the relation $\prec$ is the happen-before relation
[?], denoted by $\rightarrow$.

If for two distinct events $e_i$ and $e_j$, $e_i \not\rightarrow e_j$ and
$e_j \not\rightarrow e_i$, then $e_i$ and $e_j$ are said to be concurrent,
denoted by $e_i \parallel e_j$.
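To illustrate how the happen-before relation arises from the direct relation, the following Python sketch computes the transitive closure of a set of direct-precedence pairs and tests two events for concurrency. The event names, the function names, and the representation of the relation as a set of pairs are assumptions made for this example only.

```python
def happen_before(direct):
    """Return the transitive closure of `direct`, a set of (a, b) pairs
    meaning "a directly happens before b" (hypothetical representation)."""
    successors = {}
    for a, b in direct:
        successors.setdefault(a, set()).add(b)
        successors.setdefault(b, set())
    closure = set()
    for start in successors:
        seen, stack = set(), list(successors[start])
        while stack:                      # depth-first walk from `start`
            e = stack.pop()
            if e not in seen:
                seen.add(e)
                stack.extend(successors[e])
        closure.update((start, e) for e in seen)
    return closure


def are_concurrent(a, b, hb):
    """Two distinct events are concurrent if neither happens before the other."""
    return a != b and (a, b) not in hb and (b, a) not in hb


# Illustrative history: p1 executes e1_1 then e1_2 (a send);
# p2 executes e2_1 then e2_2 (the matching receive).
direct = {("e1_1", "e1_2"), ("e2_1", "e2_2"), ("e1_2", "e2_2")}
hb = happen_before(direct)
```

In this small history, the first events of the two processes are concurrent, while the send and its receive are ordered.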

In a system of $n$ processes, a global checkpoint is defined as a set of $n$
local checkpoints, one from each process. A consistent global checkpoint is a
global checkpoint in which no two constituent checkpoints are related by the
happen-before relation [?]. Formally:

Definition 1. In a distributed computation of $n$ processes, let CKPT denote a
global checkpoint $\{ckpt_1, ckpt_2, \ldots, ckpt_n\}$. CKPT is said to be
consistent if for all $i, j = 1, \ldots, n$ with $i \neq j$,

$$ckpt_i \parallel ckpt_j$$

Intuitively, if any two checkpoints are not causally related, i.e., are pairwise
concurrent, they are consistent with each other and can belong to some
consistent global checkpoint.

3 Characterizing Consistency of Global Checkpoints 

In this section, we state and prove a necessary and sufficient condition to de- 
termine if an arbitrary set of local checkpoints can form a consistent global 
checkpoint. Our characterization of consistency uses the isomorphic power of 
the structure of vector time and the causality structure of the underlying dis- 
tributed computation. Before we present our condition, we will briefly review 
vector time and its properties, due to Mattern [10].



3.1 Isomorphism between Vector Time and Causality 

Both Mattern [10] and Fidge [7] introduced vector time as an operational
instrument to represent the causal dependency of events in a distributed
computation. Suppose that a distributed computation consists of the set of
processes $\{p_1, p_2, \ldots, p_n\}$. Each process $p_i$ has a vector clock
$V_i$. Events occurring in a process are assigned a vector time, $V(e)$,
obtained from the process’ vector clock. This set of vector clocks in a
distributed system advances by the following rules:

VT1 $V_i[1, \ldots, n]$ is initialized with all elements equal to 0.

VT2 1. Whenever an event $e_i$ occurs, $V_i[i] := V_i[i] + 1$.

2. If the event is a message send event, the reading of $V_i$ is attached as a
timestamp to the message being sent.

3. If the event is a message receive event $e_j$, corresponding to the message
send event $e_i$, the process updates $V_j$ and assigns its value to $e_j$ by
taking the componentwise maximum of $V_j$ and the timestamp of the message,
that is, $V_j(e_j) = \sup(V_i, V_j)$, where
$\sup(V_i, V_j)[k] = \max(V_i[k], V_j[k])$ for $1 \le k \le n$.
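The rules VT1 and VT2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the class name `VectorClock`, its method names, and the 0-based process indices are hypothetical choices for this example and do not come from the paper.

```python
class VectorClock:
    """Vector clock of process `pid` (0-based) among `n` processes."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n                  # VT1: all components start at 0

    def local_event(self):
        """VT2.1: every event increments the process's own component."""
        self.v[self.pid] += 1
        return list(self.v)

    def send_event(self):
        """VT2.2: the vector time of the send event is the message timestamp."""
        return self.local_event()

    def receive_event(self, timestamp):
        """VT2.3: count the receive event, then take the componentwise maximum."""
        self.v[self.pid] += 1
        self.v = [max(a, b) for a, b in zip(self.v, timestamp)]
        return list(self.v)
```

With three processes, a process that performs one internal event and then sends carries the timestamp [2, 0, 0]; a receiver whose clock reads [0, 1, 0] moves to [2, 2, 0] on receipt.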

The comparison between vector times is defined using the relations $\le$, $<$,
and $\parallel$.

Definition 2. For two vector times $u$, $v$,

$u = v$ iff $\forall i : u[i] = v[i]$
$u \le v$ iff $\forall i : u[i] \le v[i]$

$u < v$ iff $u \le v$ and $u \neq v$
$u \parallel v$ iff $\neg(u < v)$ and $\neg(v < u)$
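The relations of Definition 2 translate directly into code. The sketch below is illustrative only; the function names are assumptions, and vector times are represented as plain Python lists of equal length.

```python
def vt_leq(u, v):
    """u <= v: every component of u is at most the matching component of v."""
    return all(a <= b for a, b in zip(u, v))


def vt_less(u, v):
    """u < v: u <= v and the two vectors differ."""
    return vt_leq(u, v) and u != v


def vt_concurrent(u, v):
    """u || v: neither vector is strictly less than the other."""
    return not vt_less(u, v) and not vt_less(v, u)
```

Note that, read literally, the last relation also holds for two equal vectors; this does not matter here because the vector times of distinct events differ.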

It has been shown that a computationally very simple relation $<$ defined on
vector time, i.e., $(V, <)$, characterizes causality [?]:

Theorem 1. For two events $e$ and $e'$ of a distributed computation, we have:

$e \rightarrow e'$ iff $V(e) < V(e')$
$e \parallel e'$ iff $V(e) \parallel V(e')$

In other words, the structure of vector time is isomorphic to the causality 
structure of the underlying distributed computation. 

In fact, we can restrict the comparison to just two vector components in order
to determine the precise causal relationship between two events ($e_i$ and
$e_j$) if their origins $p_i$ and $p_j$ are known.

Lemma 1. For two distinct events $e_i$ and $e_j$, we have

$e_i \rightarrow e_j$ iff $V(e_i)[i] \le V(e_j)[i]$

$e_i \parallel e_j$ iff $(V(e_i)[i] > V(e_j)[i])$ and $(V(e_j)[j] > V(e_i)[j])$
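Lemma 1 means that only two components need to be inspected once the originating processes are known. A minimal Python sketch follows; the function names and the 0-based indices `i` and `j` are assumptions for illustration, and the two events are assumed to be distinct and to originate in different processes.

```python
def happens_before_by_origin(v_ei, i, v_ej):
    """e_i -> e_j iff V(e_i)[i] <= V(e_j)[i], where e_i occurred in process i."""
    return v_ei[i] <= v_ej[i]


def concurrent_by_origin(v_ei, i, v_ej, j):
    """e_i || e_j iff each event is ahead of the other in its own component."""
    return v_ei[i] > v_ej[i] and v_ej[j] > v_ei[j]
```

For example, a send with vector time [2, 0, 0] in process 0 precedes a receive with vector time [2, 2, 0] in process 1, whereas [3, 0, 0] and [2, 3, 0] are concurrent.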

The isomorphism as expressed in Theorem 1 and Lemma 1 is essential to our
characterization of consistency and to the design of our algorithm. It is also
the key to showing the correctness of this algorithm.

3.2 Necessary and Sufficient Consistency Condition 

Now we are ready to present a necessary and sufficient condition to determine
whether a set of checkpoints is mutually consistent.

Theorem 2. Let $CKPT = \{ckpt_i \mid i = 1, \ldots, n\}$, where $V_i$ is the
vector time assigned to $ckpt_i$, be a set of checkpoint events in a
distributed computation, one from each process. CKPT forms a consistent global
checkpoint if and only if for each pair of local checkpoints $ckpt_i$ and
$ckpt_j$:

$$V_i(ckpt_i) \parallel V_j(ckpt_j)$$

Proof. Using vector time as an instrument, the theorem immediately follows
from Theorem 1 and Lemma 1. □
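The condition of Theorem 2 can be checked mechanically by comparing every pair of vector times. The sketch below is an illustration under assumptions: the function names are hypothetical, and the input is a list holding one vector time per process.

```python
from itertools import combinations


def strictly_less(u, v):
    """u < v in the sense of Definition 2."""
    return u != v and all(a <= b for a, b in zip(u, v))


def is_consistent(vector_times):
    """True iff the checkpoints' vector times are pairwise concurrent."""
    return all(
        not strictly_less(u, v) and not strictly_less(v, u)
        for u, v in combinations(vector_times, 2)
    )
```

The check costs one comparison per pair of processes, that is, a number of vector comparisons quadratic in $n$.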

4 Algorithm for Guaranteed Mutually Consistent 
Checkpointing 

In the algorithm described in this section, processes take local checkpoints in
an uncoordinated way, whenever and wherever they want. While doing so, some
other related processes may be forced to take additional checkpoints in order
to guarantee that the checkpoints that have been arbitrarily taken are mutually
consistent.

Which related processes are forced to take checkpoints? The answer
is the causally related processes, based on the following observation: a
local checkpoint that has been taken arbitrarily by a process may not belong to
any consistent global checkpoint (and thus may be useless). The only source
of this possible inconsistency is the communication events: a local checkpoint
might have recorded a message receive event while the communicating process
has not yet recorded the corresponding message send event. In this case, unless
the communicating process is forced to take an additional checkpoint as well to
include the send event, inconsistency results. This observation motivates our
algorithm.

Each process maintains a boolean array $recvd_i[1, \ldots, n]$ to record the
fact that a message receive event has occurred since the last checkpoint.
Initially and whenever a local checkpoint is taken, all components of $recvd_i$
are set to false (0). The array component $recvd_i[j]$ is set to true (1) if
$p_i$ has received a message from process $p_j$ since the last checkpoint. Thus
$recvd_i$ records from whom the process $p_i$ has received messages since the
last checkpoint. These processes are potential candidates that may be forced to
take an additional checkpoint if they have not done so. The checkpointing
process uses this information (in $recvd_i$) to inform each partner process, by
sending a request timestamped with the vector time of the checkpoint being
taken, that the message received has been recorded in a local checkpoint and
that the partner process should checkpoint the send event if it has not already
done so. Checkpointing by the partner process may trigger a further request to
yet another process for (possibly) checkpointing. In order to properly
terminate this checkpointing process, the process, while checkpointing, stops
sending any (new) application message and waits for acknowledgment.

Since each process independently takes its local checkpoints, it is possible that
when a process records its local state including a message receive, the sending
process has already checkpointed the send event. In this case, there is no need
for the sending process to take an additional checkpoint. The sending process
simply acknowledges with the vector time of the latest checkpoint that includes
the send event. We call the checkpoint being taken by $p_i$ and the checkpoint
in $p_j$ whose vector time is returned corresponding checkpoints. According
to Lemma 1, by comparing the vector time of its most recent checkpoint with the
vector timestamp of the request message, it is easy for the sending process
to find out whether an additional checkpoint needs to be taken. In addition,
each process $p_i$ maintains a set of pairs of the index and vector time of its
checkpoints, $S_i = \{(l, V_i) \mid l \in \mathbb{N}\}$, which can be used for
constructing a maximum/minimum consistent global checkpoint as described in the
next section.

The algorithm is formally described by the following procedure and rules:

Process $p_i$ record_local_state

— As an internal event, record local state:

- $V_i[i] := V_i[i] + 1$;

- $S_i := S_i + \{({+}{+}l,\ V_i)\}$;

— $recvd_i := 0$;

Process $p_i$ checkpointing rule

— stop sending any new application message;

— if for all $k \neq i$, $k \in \{1, \ldots, n\}$, $recvd_i[k] = 0$: record_local_state.
else, for all $k \neq i$, $k \in \{1, \ldots, n\}$, with $recvd_i[k] \neq 0$:

• record_local_state.

• send request_for_checkpointing($V_i(ckpt_i)$);

• wait for ack message: done($V_k(ckpt_k)$);

— resume sending application messages (if any).

Process $p_j$ responding rule to request_for_checkpointing($V_i(ckpt_i)$)

— if ($V_j[j]$ of the last checkpoint of process $p_j$) $< V_i(ckpt_i)[j]$:

execute checkpointing rule; (* the send event is not yet checkpointed *)

— acknowledge: done($V_j(ckpt_j)$).
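The rules above can be exercised with a small simulation. The following Python sketch is one possible reading of the rules, not the authors' implementation: it runs all processes in a single thread, models a request and its acknowledgment as an ordinary method call and return value, records the local state once before contacting all senders, and uses 0-based process indices. The names `Process`, `make_system`, `checkpoint`, and `on_request` are hypothetical.

```python
class Process:
    """One process in a synchronous, single-threaded simulation (an assumption)."""

    def __init__(self, pid, n, net):
        self.pid, self.n, self.net = pid, n, net
        self.v = [0] * n                 # vector clock V_i
        self.recvd = [False] * n         # recvd_i: senders since last checkpoint
        self.S = []                      # S_i: checkpoint vector times, in order
        self.record_local_state()        # the first event is a checkpoint

    def record_local_state(self):
        self.v[self.pid] += 1            # a checkpoint is an internal event
        self.S.append(list(self.v))
        self.recvd = [False] * self.n

    def send(self, dest):
        self.v[self.pid] += 1
        self.net[dest].receive(self.pid, list(self.v))

    def receive(self, src, timestamp):
        self.v[self.pid] += 1
        self.v = [max(a, b) for a, b in zip(self.v, timestamp)]
        self.recvd[src] = True

    def checkpoint(self):
        # Remember the senders before record_local_state clears recvd.
        senders = [k for k in range(self.n) if k != self.pid and self.recvd[k]]
        self.record_local_state()
        for k in senders:
            self.net[k].on_request(self.S[-1])   # returns once "done" is acked

    def on_request(self, stamp):
        # Responding rule: checkpoint only if the send is not yet covered.
        if self.S[-1][self.pid] < stamp[self.pid]:
            self.checkpoint()
        return self.S[-1]                # acknowledgment: done(V_j(ckpt_j))


def make_system(n):
    net = []
    for pid in range(n):
        net.append(Process(pid, n, net))
    return net
```

Because the simulation is single-threaded, no application message can be sent while a checkpoint is in progress, which stands in for the "stop sending" and "resume sending" steps. A real implementation would need asynchronous messaging and explicit handling of those steps.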

It is essential to note that since a checkpoint is an internal event within a
process, every time a checkpoint is taken by $p_i$, the $i$th component of its
vector time increases by 1, making its vector time incomparable to the vector
times of events in other processes from which $p_i$ has received messages.

Lemma 2. For two corresponding checkpoints, $ckpt_i$ and $ckpt_j$ caused by
$ckpt_i$, we have

$$V_i(ckpt_i) \parallel V_j(ckpt_j)$$

Proof. From the advance rules of vector time, after checkpointing,
$V_i(ckpt_i)[i] > V_j(ckpt_j)[i]$, and similarly,
$V_j(ckpt_j)[j] > V_i(ckpt_i)[j]$. According to Lemma 1,
$V_i(ckpt_i) \parallel V_j(ckpt_j)$. □

4.1 Example 

We further explain the algorithm using an example (Fig. 1). The example
distributed computation has three processes $\{p_1, p_2, p_3\}$ with initial
checkpoints $ckpt_1^1$, $ckpt_2^1$, and $ckpt_3^1$. At some point, after $p_2$
has received message $m_1$, it decides to take a checkpoint $ckpt_2^2$ and
finds that $recvd_2 = [100]$, indicating that it has received a message
($m_1$) since the last checkpoint ($ckpt_2^1$), so it sends out a request
timestamped with its vector time (now [230]). After receiving this request,
$p_1$ realizes from the incoming vector timestamp that it needs to take a
checkpoint as well (because $V_1(ckpt_1^1)[1] < V_2(ckpt_2^2)[1]$), and so takes
the checkpoint $ckpt_1^2$ with vector time [300]. The two vector times ([300]
and [230]) are concurrent. Later, $p_1$, after sending the message $m_3$, wants
to take a checkpoint; since $recvd_1 = [000]$, no other process is involved in
its checkpointing. It is easy to check that the resulting set of checkpoints,
one from each process, forms a consistent

global checkpoint.
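As a worked check of the claim that the vector times [300] and [230] are concurrent, Lemma 1 can be applied with $i = 1$ and $j = 2$. The shorthand $V_1$ and $V_2$ for the two vector times is introduced here only for this illustration.

```latex
\begin{align*}
V_1 = [3,0,0],\quad V_2 = [2,3,0]: \qquad
V_1[1] = 3 &> 2 = V_2[1], \\
V_2[2] = 3 &> 0 = V_1[2],
\end{align*}
% Both inequalities of Lemma 1 hold, hence V_1 \parallel V_2.
```

Each checkpoint is ahead of the other in its own component, so neither causally precedes the other.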

The checkpoint $ckpt_3^2$ demonstrates a more complex scenario. After receiving
a message $m_5$, $p_3$ wants to checkpoint its local state. Since
$recvd_3 = [010]$, it sends a request timestamped with the vector time [473] to
$p_2$ to ensure consistency. After checking, $p_2$ needs to checkpoint too.
While doing so, $p_2$ sends a request to $p_1$ with the timestamp [480]
($recvd_2 = [100]$). However, there is no need for $p_1$ to take a checkpoint
because $V_1(ckpt_1)[1] > V_2(ckpt_2)[1]$, where $ckpt_1$ is the last checkpoint
of $p_1$ and $ckpt_2$ is the checkpoint being taken by $p_2$. As a result, $p_3$
has taken a checkpoint as it wished and $p_2$ has also taken a checkpoint
as requested to ensure consistency. Clearly, these two checkpoints, together
with the last checkpoint of $p_1$, form a consistent global checkpoint.



[Figure 1: space-time diagram of $p_1$, $p_2$, and $p_3$. Recoverable
annotations: $(V_1[1] = 1) < (V_2[1] = 2)$; recvd = 000; recvd = 010;
$(V_1[1] = 5) > (V_2[1] = 4)$.]

Fig. 1. An example distributed computation.



This example shows that the algorithm guarantees mutual consistency
between the checkpoints. Below we give a formal proof of this
claim.



4.2 Correctness of the Algorithm 

The key to the correctness of the algorithm is to properly advance the vector time
of checkpoints. Note that in the algorithm this is achieved because (1) the
checkpoint event, as an internal event, increments its own component of the
vector time ($V_i[i] := V_i[i] + 1$), and (2) the request for checkpointing
per se does not advance the vector time. In essence, forcing a process to take
an additional checkpoint is forcing the vector time to advance in such a way
that two checkpoints are concurrent.

Lemma 3. Let $ckpt_i$ be a checkpoint of process $p_i$. There is always a
checkpoint in another process $p_j$ of the computation such that
$ckpt_i \parallel ckpt_j$.

Proof. Immediately from Lemma 2. □



Theorem 3. In a distributed computation of $n$ processes, let CKPT denote a
set of checkpoints taken by the algorithm, one from each process, denoted by
$\{ckpt_i \mid i = 1, \ldots, n\}$, such that
$V_i(ckpt_i) \parallel V_j(ckpt_j)$. Then the set CKPT forms a
consistent global checkpoint.

Proof. From Lemma 2 and Lemma 3, CKPT is a set of checkpoints that are
pairwise concurrent. According to Definition 1, the set forms a consistent
global checkpoint. □

Corollary 1. No checkpoint taken by the algorithm is useless. 

4.3 Discussion 

Limited Coordination. In our algorithm, while processes enjoy independence
in taking local checkpoints, they have an obligation to take additional
checkpoints. This obligation is coercive and necessary for the consistency
guarantee. Because of this obligation, we call it a semi-coordinated approach.
A question that naturally arises here is to what extent a process would
coercively undertake this obligation. At one extreme, if a checkpoint taken by
one process triggers all other processes to take checkpoints as well, this
becomes a fully coordinated approach. Fortunately, in many distributed
applications, and in most cases, the involvement of other processes in
checkpointing is very limited in terms of scope and in terms of the number of
additional checkpoints. The scope is limited because only very few processes
are likely to be involved. The number of additional checkpoints is limited
because when a request comes in, it is likely that the expected checkpoints
have already been taken. Our claim in this regard is supported by
the following observation [?]:

Observation 1. In a distributed computation, even if the number $n$ of processes
is large, only a few of them are likely to interact frequently by direct message
exchange.

This observation reveals that distributed computing typically exhibits
communication locality. Our approach explicitly exploits this locality
of distributed computing.

Inhibition. As can be noted (for example in Fig. 1), during checkpointing,
the underlying application is delayed to ensure that the necessary and sufficient
condition is satisfied. This delay (also called inhibition) is perhaps the price to be
paid for the guaranteed consistency. For a discussion of the role and spectrum
of inhibition in asynchronous consistent-cut protocols, see [?].

Event Analysis Based on Vector Time. As pointed out earlier, there have
been other published works giving a necessary and sufficient condition [?].
Netzer et al. derive their condition from the zigzag notion by analysing the
interaction pattern in distributed computations. Baldoni et al. base their work
on the notion of checkpoint intervals and analyze the relationship between the
intervals. Vector time was not used in their analysis. We have also based our
work on event analysis and used vector time as a tool in the derivation. There
exists a recent development in the theoretical framework based on logical vector
time in which several meaningful timestamps of abstract events are derived [3].

5 Constructing Maximum and Minimum Consistent
Global Checkpoints

In this section, we demonstrate how the technique developed in the preceding
sections can be used to find the maximum and minimum consistent global
checkpoints that contain a target checkpoint in one process. The concept of the
maximum and minimum consistent global checkpoints is very important for many
distributed applications [?]. For example, for software error recovery, a target
checkpoint may be suggested by a diagnosis procedure to maximize the chance
of bypassing the software bug that caused the error; for debugging applications,
the target checkpoint can be defined by a user-specified breakpoint; yet another
example is in the context of mobile distributed computing, where periodic
checkpoints of a process running on a mobile host may be stored in different
locations as the mobile host moves between cells [?], and the target checkpoint
may be the available or easily accessible checkpoint. In rollback-based failure
recovery, the maximum global checkpoint is the most recent consistent global
checkpoint (called the recovery line), whereas the minimum global checkpoint
corresponds to the notion of “move forward only if absolutely necessary during
a normal execution” or “undo as much as possible during a rollback” [?].

The maximum and minimum consistent global checkpoints, which are based on
partial order relations, are formally defined below.

Definition 3. Given a target checkpoint $ckpt_i$ of process $p_i$, let $G$
denote a consistent global checkpoint, i.e., a set of checkpoints, each from a
different process of the computation.

1) $G$ is the maximum consistent global checkpoint containing $ckpt_i$ if and
only if for all $ckpt_j \in G$, $V_j(ckpt_j)[j]$ is the largest in process
$p_j$ such that $ckpt_j \parallel ckpt_i$.

2) $G$ is the minimum consistent global checkpoint containing $ckpt_i$ if and
only if for all $ckpt_j \in G$, $V_j(ckpt_j)[j]$ is the smallest in process
$p_j$ such that $ckpt_j \parallel ckpt_i$.

The algorithms for constructing the maximum and minimum consistent global
checkpoints are straightforward once the application has been periodically
checkpointed. Recall that during checkpointing, each process $p_j$ maintains a
set $S_j$ of pairs $(l, V_j(ckpt_j))$, i.e.,
$S_j = \{(l, V_j(ckpt_j)) \mid l = 1, \ldots, last\}$, where $last$ is the
index of the most recent checkpoint. The construction of the maximum consistent
global checkpoint proceeds as follows:

1. Given a target checkpoint $ckpt_i$, let $G$ be a global checkpoint; then
$G = G + \{ckpt_i\}$.

2. For all $j \neq i$, $j = 1, \ldots, n$, search $S_j$ from the checkpoint
index $last$ for the first checkpoint $ckpt_j$ such that
$V_j(ckpt_j) \parallel V_i(ckpt_i)$; then $G = G + \{ckpt_j\}$.

3. After the search and comparison finish, $G$ is the maximum consistent global
checkpoint containing the target checkpoint $ckpt_i$ of $p_i$.



Guaranteed Mutually Consistent Checkpointing 167 



Similarly, if the search and comparison start from the reverse direction, i.e.,
from $l = 1$ rather than from $l = last$, the set $G$ obtained is the minimum
consistent global checkpoint containing the target checkpoint $ckpt_i$ of $p_i$.
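The two searches can be sketched as a single Python function with a direction flag. This is an illustrative reading of the steps above, not the authors' code: the function names, the 0-based process indices, the 1-based checkpoint indices, and the sample histories in the usage note are assumptions. The sketch follows the steps literally and does not re-check pairwise consistency among the selected checkpoints.

```python
def vt_concurrent(u, v):
    """u || v in the sense of Definition 2."""
    def less(a, b):
        return a != b and all(x <= y for x, y in zip(a, b))
    return not less(u, v) and not less(v, u)


def extreme_global_checkpoint(S, i, target_index, maximum=True):
    """S[j] lists the vector times of p_j's checkpoints in the order taken.

    Returns a dict mapping each process j to the 1-based index of its selected
    checkpoint, or None if some process has no checkpoint concurrent with the
    target checkpoint of process i.
    """
    target = S[i][target_index - 1]
    G = {i: target_index}
    for j, history in enumerate(S):
        if j == i:
            continue
        last = len(history)
        order = range(last, 0, -1) if maximum else range(1, last + 1)
        for l in order:
            if vt_concurrent(history[l - 1], target):
                G[j] = l                  # first hit in the search direction
                break
        else:
            return None
    return G
```

With hypothetical histories `[[1,0,0],[3,0,0],[5,0,0]]`, `[[0,1,0],[2,3,0],[4,8,0]]`, and `[[0,0,1],[4,7,3]]`, targeting the first checkpoint of the first process selects the first checkpoint of every process in both directions, because all later checkpoints of the other processes causally follow the target.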

6 Conclusion 

The consistent global checkpoint is an important paradigm that can be found in
many distributed applications and has been studied extensively [?]. This
paper’s contributions are:

— First, we presented a necessary and sufficient condition to characterize the 
consistency. Our characterization differs from the two previous works in that it
employs the power of isomorphism between vector time and causality, which 
makes it simple and straightforward to derive the algorithm for guaranteed 
mutually consistent global checkpoint. 

— Second, the algorithm has adopted a semi-coordinated approach towards 
checkpointing, a combination of coordinated checkpointing and uncoordinated 
independent checkpointing, which provides a guaranteed consistency for any 
checkpoint taken while keeping the coordination to a minimum. Further,
our algorithm is vector-time based; this simple mechanism ensures the
correctness of the algorithm and its implementation.

— Third, we have demonstrated how our algorithm can be used to find a max- 
imum and minimum consistent global checkpoint, a very useful notion in 
many applications. 

Since the vector clock is a well-known mechanism and has been implemented in
many distributed algorithms [?], our work can be easily adapted and used
in these systems with little extra effort and cost.


Abstract. We present a constraint system OF of feature trees that
is appropriate to specify and implement type inference for first-class
messages. OF extends traditional systems of feature constraints by a
selection constraint $x\langle y\rangle z$ “by first-class feature tree” $y$,
in contrast to the standard selection constraint $x[f]y$ “by fixed feature”
$f$. We investigate the satisfiability problem of OF and show that it can be
solved in polynomial time, and even in quadratic time in an important
special case. We compare OF with Treinen’s constraint system EF of
feature constraints with first-class features, which has an NP-complete
satisfiability problem. This comparison yields that the satisfiability
problem for OF with negation is NP-hard. Based on OF we give a
simple account of type inference for first-class messages in the spirit of
Nishimura’s recent proposal, and we show that it has polynomial time
complexity. We also highlight an immediate extension that is desirable
but makes type inference NP-hard.

Keywords: object-oriented programming; first-class messages; 

constraint-based type inference; complexity; feature constraints 



1 Introduction 

First-class messages add extra expressiveness to object-oriented programming.
First-class messages are analogous to first-class functions in functional
programming languages; a message refers to the computation triggered by the
corresponding method call, while a functional argument represents the
computation executed on application. For example, a map method can be defined
by means of first-class messages as follows:

method map(o,l) = for each message m in l: o ← m

where o is an object, l is a list of first-class messages, and o ← m sends message
m to o.
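The idea behind map can be mimicked in Python, where a message is represented as a value holding a label and arguments. This is only an analogy under assumptions: the `Message` class, the `send` and `map_messages` functions, and the `Counter` example object are hypothetical, and Python's dynamic dispatch through `getattr` performs no static type checking of the kind this paper is concerned with.

```python
class Message:
    """A first-class message: a label plus its arguments (hypothetical helper)."""

    def __init__(self, label, *args):
        self.label = label
        self.args = args


def send(obj, msg):
    """Model of `o <- m`: select the method named by the label and apply it."""
    return getattr(obj, msg.label)(*msg.args)


def map_messages(obj, messages):
    """Model of `map(o, l)`: send each message in the list to the object."""
    return [send(obj, m) for m in messages]


class Counter:
    """Illustrative receiver object."""

    def __init__(self):
        self.n = 0

    def inc(self, k):
        self.n += k
        return self.n

    def get(self):
        return self.n
```

Sending a message whose label the object does not implement raises `AttributeError` at run time, which is the kind of error a static type system for first-class messages aims to rule out.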

First-class messages are more common and crucial in distributed
object-oriented programming. A typical use of first-class messages is the
delegation of messages to other objects for execution. Such delegate objects
are ubiquitous in distributed systems: for example, proxy servers enable access
to external services (e.g., ftp) beyond a firewall. The following delegate
object defines a simple proxy server:

let ProxyServer = { new(o) = { send(m) = o ← m } };

This creates an object ProxyServer with a method new that receives an object o. 
The method returns a second object that, on receipt of a message labeled send 
and carrying a message m, forwards m to o. To create a proxy to an FTP server, 
we can execute 

let FtpProxy = ProxyServer ← new(ftp);

where ftp refers to an FTP object. A typical use of this new proxy is the following 
one: 



FtpProxy ← send(get('paper.ps.gz'))



Delegation cannot be easily expressed without first-class messages, since the 
requested messages are not known statically and must be abstracted over by a 
variable m. 
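The delegation pattern can be sketched in Python in the same style. Again this is an analogy under assumptions rather than the paper's language: `Message`, `send`, `Proxy`, and the stand-in `Ftp` class with its `get` method are all hypothetical names introduced for illustration.

```python
class Message:
    """A first-class message: a label plus its arguments (hypothetical helper)."""

    def __init__(self, label, *args):
        self.label = label
        self.args = args


def send(obj, msg):
    """Model of `o <- m`: dispatch on the message label."""
    return getattr(obj, msg.label)(*msg.args)


class Proxy:
    """Object returned by `new`: forwards any received message m to its target."""

    def __init__(self, target):
        self._target = target

    def send(self, m):
        return send(self._target, m)      # the message is not known statically


class ProxyServer:
    @staticmethod
    def new(o):
        return Proxy(o)


class Ftp:
    """Stand-in for an FTP object (an assumption for this example)."""

    def get(self, name):
        return "contents of " + name


ftp_proxy = send(ProxyServer, Message("new", Ftp()))
result = send(ftp_proxy, Message("send", Message("get", "paper.ps.gz")))
```

The proxy never names the `get` method itself; it only abstracts over the message variable `m`, which is the point of the delegation example.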

In a programming language with records, abstraction over messages corresponds
to abstraction over field names. For example, one might want to use a
function let fn x = y.x; to select the field x from record y. Neither first-class
messages nor first-class record fields can be type checked in languages from the
ML family such as SML [?] or the objective ML dialect O’Caml [23].

Recently, the second author has proposed an extension to the ML type system
that can deal with first-class messages [?]. He defines a type inference
procedure in terms of kinded unification [?] and proves it correct. This
procedure is, however, formally involved and not easily understandable or
suitable for further analysis.

In this paper, we give a constraint-based formulation of type inference for
first-class messages in the spirit of [?] that considerably simplifies the original
formulation, and we settle its complexity. For this purpose, we define a new
constraint system over feature trees [5] that we call OF (objects and features).
This constraint system extends known systems of feature constraints [?]
by a new tailor-made constraint: this new constraint is motivated by the type
inference of a message sending statement o ← m, and pinpoints the key design
idea underlying Nishimura’s system.

We investigate the (incremental) satisfiability problem for OF and show that
it can be solved in polynomial time, and in time $O(n^2)$ for an important
special case. We also show that the satisfiability problem for positive and
negative OF constraints is NP-hard, by comparing OF with Treinen’s feature
constraint system EF [?].

Based on OF, we define monomorphic type inference for first-class messages. 
Our formulation considerably simplifies the original one based on kinded uni- 
fication. A key difference between both is that we strictly separate the types 
(semantics) from the type descriptions (syntax), whereas the original system 
confused syntax and semantics by allowing variables in the types themselves.

From our complexity analysis of OF we obtain that monomorphic type inference
for first-class messages can be done in polynomial time. Incrementality
is important for modular program analysis without loss of efficiency in comparison
to global program analysis. Our constraint-based setup of type inference
allows us to explain ML-style polymorphic type inference [?] as an instance
HM(OF) of the HM(X) scheme [?]: Given a monomorphic type system based
on constraint system X, the authors give a generic construction of HM(X), i. e.,
type inference for ML-style polymorphic constrained types. Type inference for
the polymorphic system remains DEXPTIME-complete, of course [?].

In the remainder of the introduction we summarize the main idea of the type
system for first-class messages and of the constraint system OF.

1.1 The Type System 

The type system contains types for objects and messages and explains what
type of messages can be sent to a given object type. An object type is a labeled
collection of method types (i. e., a product of function types distinguished by
labels) marked by obj. E. g., the object

let o = { pos(x) = x>0, neg(p) = ¬p }

implements two methods pos and neg that behave like functions from integer
and boolean to boolean, respectively. Hence, it has the object type
$\mathrm{obj}(\mathrm{pos}{:}\,\mathrm{int} \to \mathrm{bool},\ \mathrm{neg}{:}\,\mathrm{bool} \to \mathrm{bool})$.¹
When a message $f(M)$ is sent to an object, the corresponding
method is selected according to the message label $f$ and then applied
to the message argument $M$. Since a message parameter may refer to a variety
of specific messages at run-time, it has a message type marked by msg that
collects the corresponding types (as a sum of types distinguished by labels). For
example, the expression

m = if b then pos(42) else neg(true);

defines, depending on b, a message m of message type msg(pos:int, neg:bool).
The expression $o \leftarrow m$ is well-typed since two conditions hold:

1. For both labels that are possible for m, pos and neg, the object o implements 
a method that accepts the corresponding message arguments of type int or 
bool. 

2. Both methods pos and neg have the same return type, here bool. Thus the
type of $o \leftarrow m$ is unique even though the message type is underspecified.
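The two conditions can be sketched as a small check over explicitly given types. This is only an illustration, not the paper's inference procedure: the dictionary encodings of object and message types and the name `send_type` are assumptions made here, and base types are compared as plain strings.

```python
from typing import Dict, Optional, Tuple

# Hypothetical encodings (not from the paper):
#   obj(f1: a1 -> r1, ...)  as {label: (argument_type, return_type)}
#   msg(f1: a1, ...)        as {label: argument_type}
ObjType = Dict[str, Tuple[str, str]]
MsgType = Dict[str, str]


def send_type(obj: ObjType, msg: MsgType) -> Optional[str]:
    """Return the result type of `o <- m`, or raise TypeError."""
    result = None
    for label, arg in msg.items():
        # Condition 1: the object implements a method for every possible
        # label and accepts the corresponding argument type.
        if label not in obj:
            raise TypeError(f"object has no method {label!r}")
        expected_arg, ret = obj[label]
        if expected_arg != arg:
            raise TypeError(f"method {label!r} expects {expected_arg}, got {arg}")
        # Condition 2: all selected methods agree on the return type.
        if result is not None and result != ret:
            raise TypeError("methods disagree on the return type")
        result = ret
    # None means: the message type has no labels, so nothing is constrained.
    return result


o = {"pos": ("int", "bool"), "neg": ("bool", "bool")}
m = {"pos": "int", "neg": "bool"}
```

With these definitions, `send_type(o, m)` yields `"bool"`, while a message type mentioning a label that `o` lacks is rejected. The `None` case for a message type without labels corresponds to the weakness discussed in the text that follows.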

These are the crucial intuitions underlying Nishimura's type system [18]. Our
type inference captures these intuitions fully. Formally, however, our type
inference implements a type system that does not exactly match the original one:

¹ Notice that the colons in the type $\mathrm{obj}(\mathrm{pos}{:}\,\mathrm{int} \to \mathrm{bool},\ \mathrm{neg}{:}\,\mathrm{bool} \to \mathrm{bool})$ do not
separate items from the annotation of their types, but rather the field names from
the associated type components. This notation is common in the literature on feature
trees and record typing.






Martin Muller and Susumu Nishimura 



Ours is slightly weaker and hence accepts more programs than Nishimura’s. This 
weakness is crucial in order to achieve polynomial time complexity of type in- 
ference. However, type inference for a stronger system that fills this gap would 
require both positive and negative OF constraints and thus make type inference 
NP-hard. 



1.2 Constraint-Based Type Inference 



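[Figure: a feature tree with label paper at the root and two features, conf and year, leading to leaves labeled asian and 1998.]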



It is well-known that many type inference problems have a natural and simple
formulation as the satisfiability problem of an appropriate constraint system
(e. g., [?]). Constraints were also instrumental in generalizing the ML-type
system towards record polymorphism [?], overloading [?], and subtyping [?]
(see also [?]).

Along this line, we adopt feature trees P] as the semantic domain of the 
constraint system underlying our type system. A feature tree is a possibly infi- 
nite tree with unordered marked edges (called features) and with marked nodes 
(called labels), where the features at the same node must be pairwise different. 
For example, the picture on the right shows a feature 
tree with two features conf and year that is labeled with 
paper at the root and asian resp. 1998 at the leaves. 

Feature trees can naturally model objects, records,
and messages as compound data types with labeled
components. A base type like int is a feature tree with
label int and no features. A message type $\mathrm{msg}(f_1{:}\tau_1, \ldots, f_n{:}\tau_n)$ is a feature tree
with label msg, features $\{f_1, \ldots, f_n\}$, and corresponding subtrees $\{\tau_1, \ldots, \tau_n\}$,
and an object type $\mathrm{obj}(f_1{:}\tau_1 \to \tau_1', \ldots, f_n{:}\tau_n \to \tau_n')$ is a feature tree with label
obj, features $\{f_1, \ldots, f_n\}$, and corresponding subtrees $\tau_1 \to \tau_1'$ through $\tau_n \to \tau_n'$;
the arrow notation $\tau \to \tau'$ in turn is a notational convention for a feature tree
with label $\to$ and subtrees $\tau$, $\tau'$ at fixed and distinct features d and r, the names
of which should remind of "domain" and "range".
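As a rough illustration of this encoding, finite feature trees can be written down as a small data structure. The class `FeatureTree` and the helpers `base`, `arrow`, `msg` and `obj` are names assumed here for the sketch; infinite (regular) trees, which the paper also admits, are not covered.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple


@dataclass
class FeatureTree:
    """A finite feature tree: a root label plus subtrees at distinct features."""
    label: str
    subtrees: Dict[str, "FeatureTree"] = field(default_factory=dict)

    def arity(self) -> Set[str]:
        # The features defined at the root.
        return set(self.subtrees)

    def at(self, path: Iterable[str]) -> "FeatureTree":
        # The subtree reached by following a path of features.
        tree = self
        for feature in path:
            tree = tree.subtrees[feature]
        return tree


def base(name: str) -> FeatureTree:
    return FeatureTree(name)


def arrow(dom: FeatureTree, rng: FeatureTree) -> FeatureTree:
    # tau -> tau' is a tree labeled "->" with subtrees at features d and r.
    return FeatureTree("->", {"d": dom, "r": rng})


def msg(**fields: FeatureTree) -> FeatureTree:
    return FeatureTree("msg", dict(fields))


def obj(**methods: Tuple[FeatureTree, FeatureTree]) -> FeatureTree:
    return FeatureTree("obj", {f: arrow(a, r) for f, (a, r) in methods.items()})


paper = FeatureTree("paper", {"conf": base("asian"), "year": base("1998")})
o_type = obj(pos=(base("int"), base("bool")), neg=(base("bool"), base("bool")))
```

Here `paper` is the tree from the figure, and `o_type.at(["pos", "r"])` reaches the return type of the method pos through the fixed feature r.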

Feature trees are the interpretation domain for a class of constraint languages 
called feature constraints |5i;fll27l4libj . These are a class of feature description 
logics, and, as such, have a long tradition in knowledge representation and in 
computational linguistics and constraint-based grammars . More recently, 

they have been used to model record structures in constraint programming lan- 
guages m3m- 

The constraint language of our system OF is this one:

$$\varphi ::= \varphi \wedge \varphi' \mid x = y \mid a(x) \mid x[f]y \mid F(x) \mid x\langle y\rangle z$$

The first three primitive constraints are the usual ones: The symbol $=$ denotes equality
on feature trees, $a(x)$ holds if $x$ denotes a feature tree that is labeled with $a$
at the root, and $x[f]y$ holds if the subtree of (the denotation of) $x$ at feature
$f$ is defined and equal to $y$. For a set of features $F$, the constraint $F(x)$ holds
if $x$ has at most the features in $F$ at the root; in contrast, the arity constraint
of CFT forces $x$ to have exactly the features in $F$. The constraint $x\langle y\rangle z$ is
new. It holds for three feature trees $\tau_x$, $\tau_y$, and $\tau_z$ if (i) $\tau_x$ has at least the
root features of $\tau_y$, and if (ii) for all root features $f$ at $\tau_y$, the subtree of $\tau_x$ at $f$
equals $\tau_y.f \to \tau_z$ (where $\tau_y.f$ is the subtree of $\tau_y$ at $f$).

It is not difficult to see that $x\langle y\rangle z$ is tailored to type inference of message
sending. For example, the ProxyServer above gets the following polymorphic
constrained type:

$$\forall \alpha\beta\gamma.\ \mathrm{obj}(\alpha) \wedge \mathrm{msg}(\beta) \wedge \alpha\langle\beta\rangle\gamma \Rightarrow \{\mathrm{new}{:}\ \alpha \to \{\mathrm{send}{:}\ \beta \to \gamma\}\}$$

Using notation from [22], this describes an object that accepts a message labeled
new with argument type $\alpha$, returning an object that accepts a message labeled
send with argument type $\beta$ and has return type $\gamma$; the type expresses the
additional constraint that $\alpha$ be an object type, $\beta$ be a message type appropriate for
$\alpha$, and the corresponding method type in $\alpha$ has return type $\gamma$.

Plan. Section 2 defines the constraint system OF, considers the complexity of
its satisfiability problem, and compares OF with the feature constraint systems
from the literature. Section 3 applies OF to recast the type inference for first-class
messages and compares it with the original system [18]. Section 4 concludes
the paper.

Some of the proofs in this paper are only sketched for lack of space. The
complete proofs are found in an appendix of the full paper [17].

2 The Constraint System OF 

2.1 Syntax and Semantics 

The constraint system OF is defined as a class of constraints along with their
interpretation over feature trees. We assume two infinite sets $\mathcal{V}$ of variables
$x, y, z, \ldots$ and $\mathcal{F}$ of features $f, \ldots$, where $\mathcal{F}$ contains at least d and r, and a set
$\mathcal{L}$ of labels $a, b, \ldots$ that contains at least $\to$. The meaning of constraints depends
on this label. We write $\bar{x}$ for a sequence $x_1, \ldots, x_n$ of variables whose length $n$
does not matter, and $\bar{x}{:}\bar{y}$ for a sequence of pairs $x_1{:}y_1, \ldots, x_n{:}y_n$. We use similar
notation for other syntactic categories.

Feature Trees. A path $\pi$ is a word over features. The empty path is denoted by $\varepsilon$
and the free-monoid concatenation of paths $\pi$ and $\pi'$ as $\pi\pi'$; we have $\varepsilon\pi = \pi\varepsilon = \pi$.
Given paths $\pi$ and $\pi'$, $\pi'$ is called a prefix of $\pi$ if $\pi = \pi'\pi''$ for some path $\pi''$.
A tree domain is a non-empty prefix-closed set of paths. A feature tree $\tau$ is a pair
$(D, L)$ consisting of a tree domain $D$ and a labeling function $L : D \to \mathcal{L}$. Given a
feature tree $\tau$, we write $D_\tau$ for its tree domain and $L_\tau$ for its labeling function.
The arity $ar(\tau)$ of a feature tree $\tau$ is defined by $ar(\tau) = D_\tau \cap \mathcal{F}$. If $\pi \in D_\tau$, we
write as $\tau.\pi$ the subtree of $\tau$ at path $\pi$: formally $D_{\tau.\pi} = \{\pi' \mid \pi\pi' \in D_\tau\}$ and
$L_{\tau.\pi} = \{(\pi', a) \mid (\pi\pi', a) \in L_\tau\}$. A feature tree is finite if its tree domain is
finite, and infinite otherwise. The cardinality of a set $S$ is denoted by $\#S$. Given
feature trees $\tau_1, \ldots, \tau_n$, distinct features $f_1, \ldots, f_n$, and a label $a$, we write as
$a(f_1{:}\tau_1, \ldots, f_n{:}\tau_n)$ the feature tree whose domain is $\{\varepsilon\} \cup \bigcup_{i=1}^{n} \{f_i\pi \mid \pi \in D_{\tau_i}\}$ and
whose labeling is $\{(\varepsilon, a)\} \cup \bigcup_{i=1}^{n} \{(f_i\pi, b) \mid (\pi, b) \in L_{\tau_i}\}$. We use $\tau_1 \to \tau_2$ to
denote the feature tree $\tau$ with $(\varepsilon, \to) \in L_\tau$, $ar(\tau) = \{d, r\}$, $\tau.d = \tau_1$, and $\tau.r = \tau_2$.

Syntax. An OF constraint $\varphi$ is defined as a conjunction of the following
primitive constraints:

    $x = y$                    (Equality)
    $a(x)$                     (Labeling)
    $x[f]y$                    (Selection)
    $F(x)$                     (Arity Bound)
    $x\langle y\rangle z$      (Object Selection)

Conjunction is denoted by $\wedge$. We write $\varphi' \subseteq \varphi$ if all primitive constraints in $\varphi'$
are also contained in $\varphi$, and we write $x = y \in \varphi$ [etc.] if $x = y$ is a primitive
constraint in $\varphi$ [etc.]. We denote with $\mathcal{F}(\varphi)$, $\mathcal{L}(\varphi)$, and $\mathcal{V}(\varphi)$ the set of features,
labels, and variables occurring in a constraint $\varphi$. The size $S(\varphi)$ of a constraint $\varphi$
is the number of variable, feature, and label symbols in $\varphi$.



Semantics. We interpret OF constraints in the structure $\mathcal{FT}$ of feature trees.
The signature of $\mathcal{FT}$ contains the symbol $=$, the ternary relation symbol $\cdot\langle\cdot\rangle\cdot$,
for every $a \in \mathcal{L}$ a unary relation symbol $a(\cdot)$, and for every $f \in \mathcal{F}$ a binary
relation symbol $\cdot[f]\cdot$. We interpret $=$ as equality on feature trees and the other
relation symbols as follows.

    $a(\tau)$                              if $(\varepsilon, a) \in L_\tau$
    $\tau[f]\tau'$                         if $\tau.f = \tau'$
    $F(\tau)$                              if $ar(\tau) \subseteq F$
    $\tau\langle\tau'\rangle\tau''$        if $\forall f \in ar(\tau') : f \in ar(\tau)$ and $\tau.f = \tau'.f \to \tau''$

Let $\Phi$ and $\Phi'$ be first-order formulas built from OF constraints with the usual
first-order connectives $\vee$, $\wedge$, etc., and quantifiers. We call $\Phi$ satisfiable
(valid) if $\Phi$ is satisfiable (valid) in $\mathcal{FT}$. We say that $\Phi$ entails $\Phi'$, written
$\Phi \models \Phi'$, if $\Phi \to \Phi'$ is valid, and that $\Phi$ is equivalent to $\Phi'$ if $\Phi \leftrightarrow \Phi'$ is valid.

A key difference between the selection constraints $x[f]y$ and $x\langle y\rangle z$ is that
"selection by (fixed) feature" is functional, while "selection by (first-class) feature
tree" is not:

$$x[f]y \wedge x[f]y' \to y = y' \qquad (1)$$

$$x\langle y\rangle z \wedge x\langle y\rangle z' \not\to z = z' \qquad (2)$$

The reason for the second implication not to hold is that $y$ may have no subtrees: In
this case, the constraint $x\langle y\rangle z$ does not constrain $z$ at all. I. e., this implication
holds:

$$\{\}(y) \to \forall z\ x\langle y\rangle z \qquad (3)$$

If, however, $y$ is known to have at least one feature at the root, then selecting
both $z$ and $z'$ by $y$ from $x$ implies equality of $z$ and $z'$:

$$y[f]y' \wedge x\langle y\rangle z \wedge x\langle y\rangle z' \to z = z' \qquad (4)$$

OF cannot express that $y$ has a non-empty arity; rather, to express that $y$ has
some feature it must provide a concrete witness. Using negation, this can be
expressed as $\neg\{\}(y)$. However, while satisfiability for OF is polynomial, it becomes
NP-hard if it is extended such that $\neg\{\}(x)$ can be expressed (see Section 2.3).
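The interpretation of the primitive constraints, and the contrast between implications (3) and (4), can be checked on small finite trees. The sketch below assumes a hypothetical encoding of a finite feature tree as a pair `(label, {feature: subtree})`; the function names are illustrative and not taken from the paper.

```python
def tree(label, **subtrees):
    # A finite feature tree as (root label, mapping from features to subtrees).
    return (label, dict(subtrees))


def arrow(dom, rng):
    # tau -> tau' as a tree labeled "->" with features d and r.
    return ("->", {"d": dom, "r": rng})


def has_label(a, t):
    return t[0] == a                               # a(tau)


def selects(t, f, t2):
    return f in t[1] and t[1][f] == t2             # tau[f]tau'


def arity_bound(features, t):
    return set(t[1]) <= set(features)              # F(tau)


def obj_select(tx, ty, tz):
    # tx<ty>tz: every root feature f of ty is a root feature of tx,
    # and tx.f equals ty.f -> tz.
    return all(f in tx[1] and tx[1][f] == arrow(ty[1][f], tz) for f in ty[1])


int_, bool_ = tree("int"), tree("bool")
o = tree("obj", pos=arrow(int_, bool_), neg=arrow(bool_, bool_))
m = tree("msg", pos=int_)      # has the witness feature pos
empty = tree("msg")            # satisfies {}(y)
```

With the witness feature pos, only `bool_` is a valid third argument, as in implication (4). For `empty`, both `obj_select(o, empty, int_)` and `obj_select(o, empty, bool_)` hold, which is the non-functional behaviour stated by implication (3).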

Feature Terms. For convenience, we will occasionally use feature terms [?]
as a generalization of first-order terms: Feature terms $t$ are built from variables
by feature tree construction like $a(f_1{:}t_1, \ldots, f_n{:}t_n)$, where again the features
$f_1, \ldots, f_n$ are required to be pairwise distinct. Equations between feature terms
can be straightforwardly expressed as a conjunction of OF constraints $x = y$,
$a(x)$, $F(x)$, $x[f]y$, and existential quantification. For example, the equation
$x = a(f{:}b)$ corresponds to the formula $\exists y\,(a(x) \wedge \{f\}(x) \wedge x[f]y \wedge b(y) \wedge \{\}(y))$. In
analogy to the notation $\tau_1 \to \tau_2$, we use the abbreviation $x = y \to z$ for the
equation $x = {\to}(d{:}y,\ r{:}z)$.



2.2 Constraint Solving 

Theorem 1. The satisfiability problem of OF constraints is decidable in incre- 
mental polynomial space and time. 

For the proof, we define constraint simplification as a rewriting system on
constraints in Figure 1. The theorem follows from Propositions 1, 2, and 3 below.
Rules (Substitution), (Selection), (Label Clash), and (Arity Clash) are standard.
Rules (Arity Propagation I/II) reflect the fact that a constraint $x\langle y\rangle z$ implies
the arity bound on $x$ to subsume the one on $y$. (Arity Intersection) normalizes a
constraint to contain at most one arity bound per variable. (Object Selection I)
reflects that $x\langle y\rangle z$ implies all features necessary for $y$ to be also necessary for $x$,
and (Object Selection II) establishes the relation of $x$, $y$, and $z$ at a joint feature
$f$.

Notice that the number of fresh variables introduced in rule (Object Selection
I) is bounded: This rule adds at most one fresh variable per constraint $x\langle y\rangle z$ and
feature $f$, and the number of both is constant during constraint simplification.
For the subsequent analysis, it is convenient to think of the fresh variables as
fixed in advance. Hence, we define the finite set
$\mathcal{V}'(\varphi) =_{\mathrm{def}} \mathcal{V}(\varphi) \cup \{v_{x,f} \in \mathcal{V} \mid x \in \mathcal{V}(\varphi),\ f \in \mathcal{F}(\varphi),\ v_{x,f}\ \text{fresh}\}$.

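(Substitution)
$$\frac{\varphi \wedge x = y}{\varphi[y/x] \wedge x = y} \quad \text{if } x \in fv(\varphi)$$

(Selection)
$$\frac{\varphi \wedge x[f]y \wedge x[f]z}{\varphi \wedge x[f]z \wedge y = z}$$

(Arity Propagation I)
$$\frac{\varphi \wedge x\langle y\rangle z \wedge F(x)}{\varphi \wedge x\langle y\rangle z \wedge F(x) \wedge F(y)} \quad \text{if not exists } F' : F'(y) \in \varphi$$

(Arity Propagation II)
$$\frac{\varphi \wedge x\langle y\rangle z \wedge F(x) \wedge F'(y)}{\varphi \wedge x\langle y\rangle z \wedge F(x) \wedge F \cap F'(y)} \quad \text{if } F \cap F' \neq F'$$

(Arity Intersection)
$$\frac{\varphi \wedge F(x) \wedge F'(x)}{\varphi \wedge F \cap F'(x)}$$

(Object Selection I)
$$\frac{\varphi}{\varphi \wedge x[f]x'} \quad \text{if } x\langle y\rangle z \wedge y[f]y' \in \varphi \text{ and not exists } z : x[f]z \in \varphi,\ x' \text{ fresh}$$

(Object Selection II)
$$\frac{\varphi}{\varphi \wedge x' = y' \to z} \quad \text{if } x\langle y\rangle z \wedge y[f]y' \wedge x[f]x' \in \varphi \text{ and } x' = y' \to z \notin \varphi$$

(Label Clash)
$$\frac{\varphi \wedge a(x) \wedge b(x)}{\mathit{fail}} \quad \text{if } a \neq b$$

(Arity Clash)
$$\frac{\varphi \wedge F(x) \wedge x[f]x'}{\mathit{fail}} \quad \text{if } f \notin F$$

Fig. 1. Constraint Solving Rules

Remark 1. In addition to the rules in Figure 1 there are two additional rules
justified by implications (3) and (4):

(Empty Message)
$$\frac{\varphi \wedge x\langle y\rangle z}{\varphi} \quad \text{if } \{\}(y) \in \varphi$$

(Non-empty Message)
$$\frac{\varphi \wedge x\langle y\rangle z \wedge x\langle y\rangle z'}{\varphi \wedge x\langle y\rangle z \wedge z = z'} \quad \text{if } y[f]y' \in \varphi$$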

The first one is just a simplification rule that does not have an impact on the 
satisfiability check. It helps reducing the size of a solved constraint and therefore 
saves space and time. Secondly, compact presentation of a solved constraint can 
be crucial in the type inference application where solved constraints must be 
understood by programmers. The second one is a derived rule that should be 
given priority over rule (Object Selection II). 

□ 
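To make the rules of Figure 1 concrete, here is a deliberately naive fixpoint solver. It is a sketch under assumptions, not the pool-and-store implementation analysed in Proposition 2, and it does not meet the complexity bounds stated there: the tuple encoding of primitive constraints (`"eq"`, `"lab"`, `"sel"`, `"ar"`, `"osel"`), the function names, and the fresh-variable naming scheme `_v0, _v1, ...` are all hypothetical. Arity sets must be passed as `frozenset`.

```python
import itertools

ARROW = "->"


def solve(constraints):
    """Naive fixpoint solver; returns a solved store, or None on a clash."""
    parent = {}

    def find(v):                                  # union-find representative
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def same(u, v):
        return u is not None and find(u) == find(v)

    pool, seen, fresh = list(constraints), set(constraints), itertools.count()
    while True:
        for c in pool:                            # (Substitution)
            if c[0] == "eq":
                parent[find(c[1])] = find(c[2])
        label, sel, arity, osel, new = {}, {}, {}, set(), []
        for c in pool:                            # rebuild the store
            if c[0] == "lab":
                if label.setdefault(find(c[2]), c[1]) != c[1]:
                    return None                   # (Label Clash)
            elif c[0] == "sel":
                old = sel.setdefault((find(c[1]), c[2]), c[3])
                if not same(old, c[3]):
                    new.append(("eq", old, c[3]))  # (Selection)
            elif c[0] == "ar":
                x = find(c[2])
                arity[x] = arity.get(x, c[1]) & c[1]   # (Arity Intersection)
            elif c[0] == "osel":
                osel.add((find(c[1]), find(c[2]), find(c[3])))
        if any(x in arity and f not in arity[x] for x, f in sel):
            return None                           # (Arity Clash)
        for x, y, z in osel:
            if x in arity and not (y in arity and arity[y] <= arity[x]):
                new.append(("ar", arity[x], y))   # (Arity Propagation I/II)
            for (v, f), y1 in list(sel.items()):
                if v != y:
                    continue
                if (x, f) not in sel:             # (Object Selection I)
                    new.append(("sel", x, f, f"_v{next(fresh)}"))
                    continue
                x1 = find(sel[(x, f)])            # (Object Selection II)
                if not (label.get(x1) == ARROW and x1 in arity
                        and arity[x1] <= {"d", "r"}
                        and same(sel.get((x1, "d")), y1)
                        and same(sel.get((x1, "r")), z)):
                    new += [("lab", ARROW, x1), ("ar", frozenset("dr"), x1),
                            ("sel", x1, "d", y1), ("sel", x1, "r", z)]
        new = [c for c in dict.fromkeys(new) if c not in seen]
        if not new:
            return {"label": label, "arity": arity, "sel": sel, "find": find}
        pool += new
        seen.update(new)


def arrow(x, d, r):
    # Primitive constraints for the feature term equation x = d -> r.
    return [("lab", ARROW, x), ("ar", frozenset("dr"), x),
            ("sel", x, "d", d), ("sel", x, "r", r)]


# o1 = obj(succ: int -> int, pos: int -> bool)
phi1 = ([("lab", "obj", "o1"), ("ar", frozenset({"succ", "pos"}), "o1"),
         ("sel", "o1", "succ", "s"), ("sel", "o1", "pos", "p"),
         ("lab", "int", "i"), ("lab", "bool", "b")]
        + arrow("s", "i", "i") + arrow("p", "i", "b"))
send_succ = [("lab", "msg", "m"), ("sel", "m", "succ", "a"),
             ("lab", "int", "a"), ("osel", "o1", "m", "w")]
send_pred = [("lab", "msg", "m"), ("sel", "m", "pred", "a"),
             ("lab", "int", "a"), ("osel", "o1", "m", "w")]
```

Solving `phi1 + send_succ` labels the class of `w` with int, whereas `phi1 + send_pred` runs into an arity clash because (Object Selection I) demands a feature pred on `o1`. The sketch rebuilds its tables on every pass for simplicity, and user variables are assumed not to start with `_v`.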

Proposition 1. The rewrite system in Figure 1 terminates on all OF
constraints $\varphi$.

Proof. Let $\varphi$ be an arbitrary constraint. Obviously, $\mathcal{F}(\varphi)$ is a finite set and the
number of occurring features is fixed since no rule adds new feature symbols.
Secondly, recall that the number of fresh variables introduced in rule (Object
Selection I) is bounded. Call a variable $x$ eliminated in a constraint $x = y \wedge \varphi$ if
$x \notin \mathcal{V}(\varphi)$. We use the constraint measure $(O_1, O_2, A, E, S)$ defined by

($O_1$) number of sextuples $(x, y, z, x', y', f)$ of non-eliminated variables
$x, y, z, x', y' \in \mathcal{V}'(\varphi)$ and features $f \in \mathcal{F}(\varphi)$ such that $x\langle y\rangle z \wedge x[f]x' \wedge y[f]y' \in \varphi$
but $x' = y' \to z \notin \varphi$.

($O_2$) number of tuples $(x, f)$ of non-eliminated variables $x \in \mathcal{V}'(\varphi)$ and features
$f \in \mathcal{F}(\varphi)$ such that there exist $y$, $y'$ and $z$ with $x\langle y\rangle z \wedge y[f]y' \in \varphi$ but
$x[f]x' \notin \varphi$ for any $x'$.

($A$) number of non-eliminated variables $x \in \mathcal{V}'(\varphi)$ for which no arity bound
$F(x) \in \varphi$ exists.

($E$) number of non-eliminated variables.

($S$) size of constraint as defined above.

The measure of $\varphi$ is bounded and strictly decreased by every rule application, as
the following table shows; this proves our claim.

                           O1    O2    A     E     S
    (Arity Propagation II)  =     =     =     =     <
    (Arity Intersection)    =     =     =     =     <
    (Selection)             =     =     =     =     <
    (Substitution)          ≤     ≤     ≤     <     =
    (Arity Propagation I)   =     =     <     =     >
    (Object Selection I)    =     <     >     >     >
    (Object Selection II)   <     =     =     =     >

Proposition 2. We can implement the rewrite system in Figure 1 such that it
uses space $O(n^3)$ and incremental time $O(n^5)$, or, if the number of features is
bounded, such that it uses linear space and incremental time $O(n^2)$.

Proof. We implement the constraint solver as a rewriting on pairs $(P, S)$ where
$S$ is the store that flags failure or represents a satisfiable constraint in a solved
form, and where $P$ is the pool (multiset) of primitive constraints that still must
be added to $S$. To decide satisfiability of $\varphi$ we start the rewriting on the pool
of primitive constraints in $\varphi$ and the empty store and check the failure flag on
termination.

For lack of space, we defer some involved parts of the proof to the full
paper [17].

Define $n_i = \#\mathcal{V}(\varphi)$, $n_v = n_i \cdot n_f = \#\mathcal{V}'(\varphi)$, $n_l = \#\mathcal{L}(\varphi)$, $n_f = \#\mathcal{F}(\varphi)$. In
the full paper, we define a data structure for the store that consists of a
union-find data structure [?] for equations, tables for the constraints $a(x)$, $F(x)$, and
$x[f]z$, a list for constraints $x\langle y\rangle z$, and two adjacency list representations of the
graphs whose nodes are the initial variables, and whose edges $(x, y)$ are given
by the constraints $x\langle y\rangle z$ for the first one and $y\langle x\rangle z$ for the second one. (See the
appendix of the full paper [17] for details.) This data structure has size

$$O(n_i \cdot n_f + n_i + n_v \cdot n_f + n_l \cdot n_f + n) = O(n_v \cdot n_f + n)$$

which is $O(n)$ if the number of features is assumed constant and $O(n^3)$ otherwise.
It also allows to check in time $O(1)$ whether it contains a given primitive
constraint, and to add it in quasi-constant time.² This is clear in the non-incremental
(off-line) case where $n_v$, $n_i$, $n_f$, and $n_l$ are fixed. In the incremental (on-line) case,
where $n_v$, $n_i$, $n_f$, and $n_l$ may grow, we can use dynamically extensible hash
tables [?] to retain constant time check and update for primitive constraints.

² All constraints except equations can be added in time $O(1)$, but addition of an
equation costs amortized time $O(\alpha(n_v))$ where $\alpha(n_v)$ is the inverse of Ackermann's
function. For all practical purposes, $\alpha(n_v)$ can be considered as constant.

Each step of the algorithm removes a primitive constraint from the pool $P$,
adds it to the store $S$, and then derives all its immediate consequences under
the simplification rules: Amongst them, equations $x = y$ and selections $x[f]y$ are
put back into the pool, while selections $x\langle y\rangle z$ and arity bounds $F(x)$ are directly
added to the store.

We show that every step can be implemented such that it costs time
$O(n + n_i \cdot n_f)$.³ (The complete analysis of this time bound can be found in the appendix of the
full paper [17].) We also show that every step may at most add $O(n)$ equations
and $O(n_v)$ selection constraints of the form $x[f]y$. It remains to estimate the
number of steps: There are at least $O(n)$ steps needed for touching all primitive
constraints in $\varphi$.

— Amongst the new equations, there are at most $O(n_v)$ relevant ones, in the
sense that one can at most execute $n_v$ equations before all variables are
equated. That is, all but $O(n_v)$ equations cost constant time.

— Amongst the new selection constraints, there are at most $O(n_v \cdot n_f)$ relevant
ones since adding a selection constraint $x[f]y$ induces immediate work only
if $x$ has no selection constraint on $f$ yet. The others will generate a new
equation and terminate then. Hence, all but $O(n_v \cdot n_f)$ selection constraints
cost constant time.

In summary, there are $O(n + n_v \cdot n_f)$ steps that cost $O(n + n_i \cdot n_f)$. Each of these
steps may add $O(n)$ equations and $O(n_v)$ selections each of which may add a
new equation itself. Hence we have $O((n + n_v \cdot n_f) \cdot (n + n_v))$ steps that cost
$O(1)$. Overall, the algorithm has the complexity

$$O((n + n_v \cdot n_f) \cdot (n + n_i \cdot n_f) + (n + n_v \cdot n_f) \cdot (n + n_v) \cdot 1) = O((n + n_v \cdot n_f) \cdot (n + n_i \cdot n_f))$$

Since $O(n_f) = O(n)$ and $O(n_v) = O(n_i \cdot n_f) = O(n^2)$, this bound is $O(n^5)$.
If the number of features is bounded, $O(n_v) = O(n_i) = O(n)$, so the bound is
rather $O(n^2)$. □
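The last step of the proof can be spelled out as follows. This is only a restatement of the arithmetic, using the quantities $n_i$, $n_f$, $n_v$ defined in the proof and the estimates $n_i \le n$, $n_f \le n$; the constant $c$ bounding the number of features is a name introduced here.

```latex
% General case: n_i <= n, n_f <= n, hence n_v = n_i * n_f <= n^2.
\begin{align*}
  n + n_v \cdot n_f &\le n + n^2 \cdot n = O(n^3), \\
  n + n_i \cdot n_f &\le n + n \cdot n = O(n^2), \\
  (n + n_v \cdot n_f)\cdot(n + n_i \cdot n_f) &= O(n^3)\cdot O(n^2) = O(n^5).
\end{align*}
% Bounded features: n_f <= c for a constant c, hence n_v <= c * n.
\begin{align*}
  n + n_v \cdot n_f &\le n + c^2 n = O(n), \\
  n + n_i \cdot n_f &\le n + c\,n = O(n), \\
  (n + n_v \cdot n_f)\cdot(n + n_i \cdot n_f) &= O(n)\cdot O(n) = O(n^2).
\end{align*}
```

The first factor also reproduces the space bounds: $O(n_v \cdot n_f + n)$ is $O(n^3)$ in general and $O(n)$ for a bounded number of features.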

Notice that a constraint system of records with first-class record labels is 
obtained as an obvious restriction of OF and the above result implies the same 
time complexity bound as OF. 

Proposition 3. Every OF constraint $\varphi$ which is closed under the rules in Figure
1 (and hence is different from fail) is satisfiable.

Proof. For the proof, we need to define a notion of path reachability similar to
the one used in earlier work, such as [?]. For all paths $\pi$ and constraints $\varphi$,
we define a binary relation $\leadsto^{\varphi}_{\pi}$ where $x \leadsto^{\varphi}_{\pi} y$ reads as "$y$ is reachable from $x$
over path $\pi$ in $\varphi$":

    $x \leadsto^{\varphi}_{\varepsilon} x$          for every $x$
    $x \leadsto^{\varphi}_{\varepsilon} y$          if $y = x \in \varphi$ or $x = y \in \varphi$
    $x \leadsto^{\varphi}_{f} y$                    if $x[f]y \in \varphi$
    $x \leadsto^{\varphi}_{\pi\pi'} y$              if $x \leadsto^{\varphi}_{\pi} z$ and $z \leadsto^{\varphi}_{\pi'} y$.

³ To be precise, each step costs $O(n + n_i \cdot n_f + \alpha(n_v))$ which we sloppily simplify to
$O(n + n_i \cdot n_f)$.

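Define relations $x \leadsto^{\varphi}_{\pi} a$ meaning that "label $a$ can be reached from $x$ over path
$\pi$ in $\varphi$":

    $x \leadsto^{\varphi}_{\pi} a$   if $x \leadsto^{\varphi}_{\pi} y$ and $a(y) \in \varphi$

Fix an arbitrary label unit. For every closed constraint $\varphi$ we define the mapping
$\alpha$ from variables into feature trees as follows.

$$D_{\alpha(x)} = \{\pi \mid \text{exists } y : x \leadsto^{\varphi}_{\pi} y\}$$

$$L_{\alpha(x)} = \{(\pi, a) \mid x \leadsto^{\varphi}_{\pi} a\} \cup \{(\pi, \mathrm{unit}) \mid \pi \in D_{\alpha(x)} \text{ but } \nexists a : x \leadsto^{\varphi}_{\pi} a\}$$

It remains to be shown that $\alpha$ defines a mapping into feature trees for closed
constraints $\varphi$, and that $\alpha$ indeed satisfies $\varphi$. This can be done by a straightforward
induction over paths $\pi$. □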



2.3 Relation to Feature Constraint Systems 

We compare OF with feature constraint systems in the literature: Given a
two-sorted signature with variables $x, y, z, \ldots$ and $u, v, w, \ldots$ ranging over feature
trees and features, resp., collections of feature constraints from the following list
have, amongst others, been considered [?]:

$$\varphi ::= x = y \mid a(x) \mid x[f]y \mid Fx \mid u = f \mid x[u]y \mid \varphi \wedge \varphi'$$

The constraints $x = y$, $a(x)$, and $x[f]y$ are the ones of OF. The arity constraint $Fx$
(where $F$ is a finite set of features) states that $x$ has exactly the features in $F$
at the root:

$$F\tau \quad \text{if} \quad ar(\tau) = F$$

Apparently, both arity constraints are interreducible by means of disjunctions:
$F(x) \leftrightarrow \bigvee_{F' \subseteq F} F'x$. The constraints of FT [?] contain $x = y$, $a(x)$, and $x[f]y$,
CFT [?] extends FT by $Fx$, and EF [?] contains the constraints $x = y$, $a(x)$,
$u = f$, $Fx$, and $x[u]y$.

The satisfiability problems for FT and CFT are quasi-linear [?]. In
contrast, the satisfiability problem for EF is NP-hard, as Treinen shows by reducing
the minimal cover problem to it [?]. Crucial in this proof is the following
implication:

$$\{f_1, \ldots, f_n\}x \wedge x[u]y \to \bigvee_{i=1}^{n} u = f_i$$

In order to express a corresponding disjunction in OF, we need existential
quantification and constraints of the form $\neg\{\}(y)$:

$$\{f_1, \ldots, f_n\}(x) \wedge x\langle y\rangle z \wedge \neg\{\}(y) \to \bigvee_{i=1}^{n} \exists z_i\ y[f_i]z_i$$

With this new constraint we reduce the satisfiability check for EF to the one for
OF.

Proposition 4. There is an embedding $[\![\cdot]\!]$ from EF constraints into OF with
negative constraints of the form $\neg\{\}(x)$ such that every EF constraint $\varphi$ is
satisfiable iff $[\![\varphi]\!]$ is.

Proof. Labeling $a(x)$ and equality $x = y$ translate trivially. Now assume two
special labels unit and lab; we use these to represent features $f$ in EF by feature
trees lab($f$:unit).

$$[\![u = f]\!] = \exists x\,(\mathrm{lab}(u) \wedge \{f\}(u) \wedge u[f]x \wedge \mathrm{unit}(x) \wedge \{\}(x))$$

$$[\![x[u]y]\!] = x\langle u\rangle y \wedge \neg\{\}(u)$$

$$[\![\{f_1, \ldots, f_n\}x]\!] = \{f_1, \ldots, f_n\}(x) \wedge \bigwedge_{i=1}^{n} \exists y\ x[f_i]y$$

To show that satisfiability of an EF constraint $\varphi$ implies satisfiability of the
OF constraint $[\![\varphi]\!]$ we map every EF solution of $\varphi$ to an OF solution of $[\![\varphi]\!]$
by replacing every feature $f$ by lab($f$:unit) and every feature tree of the form
$a(f{:}\tau \ldots)$ by $a(f{:}\mathrm{unit} \to \tau \ldots)$.

For the inverse, we take a satisfiable OF constraint $[\![\varphi]\!]$ and construct an
OF solution of $[\![\varphi]\!]$ which maps all variables $u$ in selector position in $x\langle u\rangle y$ to a
feature tree with exactly one feature. From this solution we derive an EF solution
of $\varphi$ by replacing these singleton-feature trees by their unique feature. Notice
that the solution constructed in the proof of Proposition 3 does not suffice since
it may map $u$ to a feature tree without any feature.

Formally, we extend the definition of path reachability by $x \leadsto^{\varphi}_{\pi} F$ meaning
that "arity bound $F$ can be reached from $x$ over path $\pi$ in $\varphi$":

    $x \leadsto^{\varphi}_{\pi} F$   if $x \leadsto^{\varphi}_{\pi} y$ and $F(y) \in \varphi$

We assume an order on $\mathcal{F}(\varphi)$, and, for non-empty $F$, let $\min(F)$ denote the
smallest feature in $F$ wrt. this order. We define $\alpha$ as follows:

$$D_{\alpha(x)} = \{\pi \mid \text{exists } y : x \leadsto^{\varphi}_{\pi} y\} \cup \{\pi f \mid x \leadsto^{\varphi}_{\pi} F,\ f = \min(F)\}$$

$$L_{\alpha(x)} = \{(\pi, a) \mid x \leadsto^{\varphi}_{\pi} a\} \cup \{(\pi, \mathrm{unit}) \mid \pi \in D_{\alpha(x)},\ \nexists a : x \leadsto^{\varphi}_{\pi} a\}$$

By the constraint $\neg\{\}(u)$ in the translation of $x[u]y$ we know that, for closed
and non-failed $\varphi$, the set $F$ must be non-empty in the first line. Hence, $\alpha$ is a
well-defined mapping into feature trees. It is easy to show that $\alpha$ satisfies $\varphi$ and
that $\alpha$ corresponds to an EF solution as sketched above. □

Corollary 1. The satisfiability problem of every extension of OF that can
express $\neg\{\}(x)$ is NP-hard.

For example, the satisfiability problem of positive and negative OF constraints
is NP-hard.

3 Type Inference



We reformulate the type inference of [18] in terms of OF constraints. As in
that paper, we consider a tiny object-oriented programming language with this
abstract syntax:⁴

    M ::= b                                             (Constant)
        | x                                             (Variable)
        | f(M)                                          (Message)
        | {f_1(x_1) = M_1, ..., f_n(x_n) = M_n}         (Object)
        | M ← N                                         (Message Passing)
        | let y = M in N                                (Let Binding)

The operational semantics contains no surprise (see [?]). For the types, we
assume additional distinct labels msg and obj to mark message and object types,
and a set of distinct labels such as int, bool, etc., to mark base types. Monomorphic
types are all feature trees over this signature; monomorphic type terms are
feature terms:

    t ::= α                                             (Type variable)
        | int | bool | ...                              (Base type)
        | msg(f_1:t_1, ..., f_n:t_n)                    (Message type)
        | obj(f_1:t_1→t_1', ..., f_n:t_n→t_n')          (Object type)

Somewhat sloppily, we allow for infinite (regular) feature terms so as to
incorporate recursive types without an explicit $\mu$ notation. Recursive types are
necessary for the analysis of recursive objects. We assume a mapping typeof
from constants of base type to their corresponding types. We also use the kinding
notation $x :: a(f_1{:}t_1, \ldots, f_n{:}t_n)$ to constrain $x$ to a feature tree whose arity
is underspecified, e. g., $a(x) \wedge \bigwedge_{i=1}^{n} x[f_i]t_i$.



Monomorphic Type Inference. The monomorphic type system is given in
Figure 2. As usual, $\Gamma$ is a finite mapping from variables to type terms and
$\Gamma; x : t$ extends $\Gamma$ so that it maps variable $x$ to $t$. The type system defines
judgments $\varphi, \Gamma \vdash M : t$ which reads as "under the type assumptions in $\Gamma$
subject to the constraint $\varphi$, the expression $M$ has type $t$";⁵ the constraint $\varphi$
in well-formed judgements is required to be satisfiable. We do not comment
further on the type system here but refer to [18] for intuitions and to [?] for
notation. The corresponding type inference is given in Figure 3 as a mapping $\mathcal{I}$
from a variable $x$ and a program expression $M$ to an OF constraint such that
every solution of $x$ in $\mathcal{I}(x, M)$ is a type of $M$. For ease of reading, we use the
bound variables in program expressions as their corresponding type variables.

⁴ In contrast to [18], we drop letobj and allow let to introduce recursively defined
expressions.

⁵ This terminology is slightly sloppy but common: Since $t$ may contain type variables
it is rather a type term than a type and it would be accurate to say that $M$ has
"some type matching $t$".

(Var)
$$\frac{x : t \in \Gamma}{\varphi, \Gamma \vdash x : t}$$

(Const)
$$\frac{}{\varphi, \Gamma \vdash b : \mathit{typeof}(b)}$$

(Msg)
$$\frac{\varphi, \Gamma \vdash M : t' \qquad \varphi \models_{OF} t :: \mathrm{msg}(f : t')}{\varphi, \Gamma \vdash f(M) : t}$$

(Obj)
$$\frac{\varphi, \Gamma; x_i : t_i \vdash M_i : t_i' \quad \text{for every } i = 1, \ldots, n}{\varphi, \Gamma \vdash \{f_1(x_1) = M_1, \ldots, f_n(x_n) = M_n\} : \mathrm{obj}(f_1{:}t_1 \to t_1', \ldots, f_n{:}t_n \to t_n')}$$

(MsgPass)
$$\frac{\varphi, \Gamma \vdash M : t_1 \qquad \varphi, \Gamma \vdash N : t_2 \qquad \varphi \models_{OF} \mathrm{obj}(t_1) \wedge \mathrm{msg}(t_2) \wedge t_1\langle t_2\rangle t_3}{\varphi, \Gamma \vdash M \leftarrow N : t_3}$$

(Let, monomorphic)
$$\frac{\varphi, \Gamma; y : t_1 \vdash M : t_1 \qquad \varphi, \Gamma; y : t_1 \vdash N : t_2}{\varphi, \Gamma \vdash \mathrm{let}\ y = M\ \mathrm{in}\ N : t_2}$$

Fig. 2. The Monomorphic Type System



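$$\begin{array}{lcl}
\mathcal{I}(x, b) &=& a(x) \wedge \{\}(x) \quad \text{if } a = \mathit{typeof}(b) \\
\mathcal{I}(x, y) &=& x = y \\
\mathcal{I}(x, f(M)) &=& \exists y\,(\mathrm{msg}(x) \wedge x[f]y \wedge \mathcal{I}(y, M)) \\
\mathcal{I}(x, \{f_1(x_1) = M_1, \ldots, f_n(x_n) = M_n\}) &=& \mathrm{obj}(x) \wedge \{f_1, \ldots, f_n\}(x)\ \wedge \\
 & & \bigwedge_{i=1}^{n} \exists x_i\, \exists x'\, \exists z\,(x[f_i]x' \wedge x' = x_i \to z \wedge \mathcal{I}(z, M_i)) \\
\mathcal{I}(x, M \leftarrow N) &=& \exists y\, \exists z\,(y\langle z\rangle x \wedge \mathrm{obj}(y) \wedge \mathcal{I}(y, M) \wedge \mathrm{msg}(z) \wedge \mathcal{I}(z, N)) \\
\mathcal{I}(x, \mathrm{let}\ y = M\ \mathrm{in}\ N) &=& \exists y\,(\mathcal{I}(y, M) \wedge \mathcal{I}(x, N))
\end{array}$$

Fig. 3. Monomorphic Type Inference for First-Class Messages with OF
Constraints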
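The mapping of Figure 3 can be read as a recursive constraint generator. The following sketch assumes a hypothetical tuple encoding of the abstract syntax (`("const", b)`, `("var", x)`, `("msg", f, M)`, `("obj", [(f, x, M), ...])`, `("send", M, N)`, `("let", y, M, N)`), emits primitive constraints as tagged tuples, and realizes each existential quantifier by a fresh variable name. The function names, the `typeof` stub and the naming scheme `_t0, _t1, ...` are assumptions, and bound program variables are assumed to be pairwise distinct.

```python
import itertools

_counter = itertools.count()


def fresh():
    # A fresh type variable standing in for an existentially quantified one.
    return f"_t{next(_counter)}"


def typeof(b):
    # Hypothetical stand-in for the paper's typeof mapping on constants.
    return "bool" if isinstance(b, bool) else "int"


def infer(x, M):
    """Return a list of primitive OF constraints describing I(x, M)."""
    kind = M[0]
    if kind == "const":
        return [("lab", typeof(M[1]), x), ("ar", frozenset(), x)]
    if kind == "var":
        return [("eq", x, M[1])]
    if kind == "msg":
        _, f, arg = M
        y = fresh()
        return [("lab", "msg", x), ("sel", x, f, y)] + infer(y, arg)
    if kind == "obj":
        _, methods = M
        cs = [("lab", "obj", x),
              ("ar", frozenset(f for f, _, _ in methods), x)]
        for f, xi, body in methods:
            x1, z = fresh(), fresh()
            # x[f]x' and x' = x_i -> z, then the constraint of the body.
            cs += [("sel", x, f, x1), ("lab", "->", x1),
                   ("ar", frozenset("dr"), x1),
                   ("sel", x1, "d", xi), ("sel", x1, "r", z)]
            cs += infer(z, body)
        return cs
    if kind == "send":
        _, receiver, message = M
        y, z = fresh(), fresh()
        return ([("osel", y, z, x), ("lab", "obj", y), ("lab", "msg", z)]
                + infer(y, receiver) + infer(z, message))
    if kind == "let":
        _, y, bound, body = M
        return infer(y, bound) + infer(x, body)
    raise ValueError(f"unknown expression kind: {kind!r}")


# let o = {id(x) = x, one(u) = 1} in o <- one(true)
prog = ("let", "o",
        ("obj", [("id", "x", ("var", "x")), ("one", "u", ("const", 1))]),
        ("send", ("var", "o"), ("msg", "one", ("const", True))))
cs = infer("w", prog)
```

Each syntax node contributes a bounded number of primitive constraints apart from the arity set of an object, which is in line with the remark in the text that the generated constraint has size proportional to the program. Deciding satisfiability of `cs` is left to a separate constraint solver.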



Correctness of the type inference with respect to the type system is obvious,
and it should be clear that soundness of the type system (with respect to the
assumed operational semantics) can be shown along the lines given in [18]. The
type inference generates a constraint whose size is proportional to the size of
the given program expression. Hence, we know from Proposition 2 that type
inference can be done in polynomial time and space.⁶

Let us give some examples. To reduce the verbosity of OF constraints, we
shall freely use feature term equations as introduced above. First, the statement

let o1 = {succ(x)=x+1, pos(x)=x>0};

defines an object with two methods succ : int→int and pos : int→bool. Type
inference gives the type of this object as an OF constraint on the type variable
$o_1$ equivalent to

$$\varphi_1 \equiv o_1 = \mathrm{obj}(\mathrm{succ} : \mathrm{int} \to \mathrm{int},\ \mathrm{pos} : \mathrm{int} \to \mathrm{bool}).$$

⁶ To be precise, we have to show that every satisfiable OF constraint derived by type
inference is satisfiable in the smaller domain of types; this is easy.






A delegate object for the object o1 is defined as follows:

let o2 = {redirect(m) = o1 ← m};

where m is a parameter that binds messages to be redirected to o1. Assuming
the variable $o_1$ to be constrained by $\varphi_1$, the constraint $\varphi_2$ restricts $o_2$ to the type
of o2:

$$\varphi_2 \equiv \exists m\, \exists z\,(o_2 = \mathrm{obj}(\mathrm{redirect} : m \to z) \wedge o_1\langle m\rangle z \wedge \mathrm{msg}(m)).$$

The return type of a message passing to this object, e. g.,

let w = o2 ← redirect(succ(1));

is described as the solution of $\varphi_1 \wedge \varphi_2 \wedge \varphi_3$ for the type variable $w$, where

$$\varphi_3 \equiv \exists z'\,(o_2\langle z'\rangle w \wedge z' :: \mathrm{msg}(\mathrm{redirect} : \mathrm{msg}(\mathrm{succ} : \mathrm{int}))).$$

The solved form of $\varphi_1 \wedge \varphi_2 \wedge \varphi_3$ contains the primitive constraint int($w$), which
tells the intended result type int.

If o1 does not respond to the message argument of redirect, for instance as
in

let v = o2 ← redirect(pred(1)),

a type error is detected as inconsistency in the derived constraint. Here, the
constraint

$$\varphi_4 \equiv \exists z'\,(o_2\langle z'\rangle w' \wedge z' :: \mathrm{msg}(\mathrm{redirect} : \mathrm{msg}(\mathrm{pred} : \mathrm{int})))$$

implies $\exists z'\,(o_1\langle z'\rangle w' \wedge z' :: \mathrm{msg}(\mathrm{pred} : \mathrm{int}))$, and hence that $o_1$ has a feature pred,
which contradicts $\varphi_1$ by an arity clash.

Now recall that in OF the implication $x(y)z \wedge x(y)z' \to z = z'$ does not hold.
The following example demonstrates how this weakness affects typing and type
inference.

    let o1 = {a(x)=x+1, b(x)=x>0} in o2 = {b(x)=x=0, c(x)=x*2}
    in o3 = {foo(m) = begin o1 ← m; o2 ← m end};

It is easy to see that the foo method always returns bool, since the argument
message of foo must be accepted by both the objects o1 and o2, which share
only the method name b. However, type inference for this program derives
(essentially) the constraint

$$o_1 = \mathrm{obj}(a : int \to int,\ b : int \to bool) \wedge o_2 = \mathrm{obj}(b : int \to bool,\ c : int \to int) \wedge o_3 = \mathrm{obj}(foo : m \to z) \wedge o_1(m)z_1 \wedge o_2(m)z_2$$

Herein, the result type $z$ of the method foo is neither entailed to equal $z_1$ nor $z_2$.
This is reasonable since the message m in this program is not sent and hence 
may safely return anything. By a similar argument, the following program can 
be considered acceptable: 

    let o1 = {a(x)=x+1} in o2 = {c(x)=x*2}
    in o3 = {foo(m) = begin if b then o1 ← m else o2 ← m end}

One may complain that methods of this kind should be detected as type
errors. Manipulating the type system and the type inference to do this is easy:
one just needs to exclude the type msg(), i.e., message types without any feature.
However, recall that the polynomial time complexity of the analysis depends on
this weakness: the corresponding clause in the type inference

$$I(x, f(M)) = \exists y\, (\neg\{\}(x) \wedge \mathrm{msg}(x) \wedge x[f]y \wedge I(y, M))$$

generates OF constraints with negation such that constraint solving (and hence
type inference) would become NP-hard (see footnote).

Polymorphic Type Inference. We can obtain polymorphic type inference
by applying the scheme HM(X). The constraint system OF is a viable
parameter for HM(X) since it satisfies the two required properties, called co- 
herence and soundness. Both rely on a notion of monomorphic types, in our 
case, given by feature trees; it does no harm that these may be infinite. The 
coherence property requires that the considered order on types is semantically 
well-behaved; this is trivial in our case since we only consider a trivial order 
on feature trees. The soundness property that a solved constraint indeed has a 
solution follows from Proposition 0 

Comparison with Nishimura. In Nishimura’s original type system,
abbreviated as $\mathcal{D}$ in the following, constraints are modeled as kinded type
variables. The bindings have a straightforward syntactic correspondence with
OF constraints: the message binding $x :: \langle\langle f_1{:}t_1, \ldots, f_n{:}t_n \rangle\rangle_F$ corresponds to
$x :: \mathrm{msg}(f_1{:}t_1, \ldots, f_n{:}t_n) \wedge F(x)$, and the object binding $x :: \{y_1 \to t_1, \ldots, y_n \to t_n\}_F$
corresponds to $\mathrm{obj}(x) \wedge \bigwedge_{i=1}^{n} x(y_i)t_i$.

Our reformulation HM(OF) of $\mathcal{D}$ is in the same spirit as the reformulation
HM(REC) of Ohori’s type system for the polymorphic record calculus: both
recast the kinding system as a constraint system. One might thus expect the
relation of $\mathcal{D}$ and HM(OF) to be as close as that between Ohori’s system and
HM(REC), which type exactly the same programs (“full and faithful”); this is,
however, not the case.

There is a significant difference between the kind system in $\mathcal{D}$ and OF.
In $\mathcal{D}$, kinded types may contain variables, e.g., an object returning integers as a
response to messages of type y receives the kind $\{y \to int\}_F$. On unifying two types
with bindings $\{y \to int\}_F$ and $\{y \to z\}_F$, the type inference for $\mathcal{D}$ unifies z and int
since it is syntactically known that both z and int denote the type of the response
of the same object to the same message. Thus in $\mathcal{D}$, the name of type variables
is crucial. In this paper, variables only occur as part of type descriptions (i.e.,
syntax) while the (semantic) domain of types does not contain variables. E.g.,
we understand $\{y \to int\}$ not as a type but as part of a type description which can
be expressed by a constraint like $\mathrm{obj}(x) \wedge x(y)int$.

^ The altered type inference still does not exactly correspond to the original. For
example, we would not accept message sending to an empty object as in {} ← m,
while the original one does.

As a consequence, well-typedness in our system does not depend on the choice
of variable names but only on the type of variables. This is usual for ML-style
type systems but does not hold for $\mathcal{D}$. Consider the following example:

    {foo(m) = (o ← m) + 1; (o ← m) & true}

This program is accepted by the OF-based type system, since the constraint
$o(m)int \wedge o(m)bool$ is satisfiable. The type system $\mathcal{D}$, however, rejects it after
trying to unify int and bool during type inference. To see that this is a syntactic
argument, notice that $\mathcal{D}$ accepts the following program, where o is replaced by
the object constant {}:

    {baz(m) = ({} ← m) + 1; ({} ← m) & true}

4 Conclusion 

We have presented a new constraint system OF over feature trees and investigated
the complexity of its satisfiability problem. OF is designed for specification
and implementation of type inference for first-class messages in the spirit
of Nishimura’s system. We have given a type system for which monomorphic
type inference with OF constraints can be done in polynomial time; this
system is weaker than the original one, but the additional expressiveness would 
render monomorphic type inference NP-hard as we have shown. Given OF, we 
can add ML-style polymorphism by instantiating the recent HM(X) scheme to 
the constraint system OF. 

Abstract. Type-directed partial evaluation is a new approach to program
specialization for functional programming languages. Its merits
with respect to the traditional offline partial evaluation approach have 
not yet been fully explored. We present a comparison of type-directed 
partial evaluation with standard offline partial evaluation in both a qual- 
itative and quantitative way. For the latter we use implementations of 
both approaches in Scheme. Both approaches yield equivalent results in 
comparable time. 



1 Introduction 

Partial evaluation is a technique for automatically specializing programs. One 
approach is offline partial evaluation where specialization is entirely driven by 
the results of a program analysis. We consider two offline frameworks which are 
superficially quite different. 

In the traditional approach the program analysis is a binding-time analysis, an 
abstraction of the semantics of specialization. A binding-time analysis determines 
for each program point whether the specializer should generate code or execute 
it. 
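
To make the role of a binding-time analysis concrete, here is a deliberately small Python sketch. It is not the analysis used by any system discussed in this paper: it handles only first-order expression trees, and the tuple encoding and the function name `bta` are assumptions made for illustration. Every node is marked S (execute at specialization time) or D (generate code), and a node is dynamic as soon as one of its arguments is.

```python
# Toy binding-time analysis for first-order expressions (illustrative only).
# Expressions: int/bool constants, variable names (str), or (op, arg, ...).

def bta(expr, dynamic_vars):
    """Return (annotated_expr, binding_time) with binding_time in {"S", "D"}."""
    if isinstance(expr, (int, bool)):
        return expr, "S"                      # constants are always static
    if isinstance(expr, str):
        return expr, ("D" if expr in dynamic_vars else "S")
    op, *args = expr
    annotated = [bta(arg, dynamic_vars) for arg in args]
    # An operation is dynamic if any argument depends on dynamic data.
    bt = "D" if any(b == "D" for _, b in annotated) else "S"
    return (f"{op}:{bt}", *(a for a, _ in annotated)), bt

# n is static, x is dynamic: n * 2 can be executed, the addition cannot.
term, bt = bta(("+", ("*", "n", 2), "x"), dynamic_vars={"x"})
```

In the annotated result, the multiplication is tagged `*:S` and the addition `+:D`, mirroring the decision "execute" versus "generate code" per program point.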

An offline partial evaluator is then either an interpreter of annotated programs
or it follows the approach of writing cogen by hand. Self-application
of the partial evaluator improves the performance of the interpreter-based
alternative. It can transform the annotated program into a dedicated program
generator. The cogen-by-hand alternative performs this step directly, thus avoiding
any interpretive overheads. The latter alternative is the current state of the art
in partial evaluation.

In the type-directed approach it is only the type of the program (and its
free variables) that drives the specialization. It avoids interpretation completely 
by applying to compiled programs without necessarily requiring access to their 
source, as long as their types are known. It guarantees that the specialized 
program is in normal form. 



Jieh Hsiang, Atsushi Ohori (Eds.); ASIAN’98, LNCS 1538, pp. 188-|2nSI 1998. 
(c) Springer- Verlag Berlin Heidelberg 1998 

The aim of this work is to clarify the relationship between these two approaches.
Some of the issues considered have already been touched upon in
Danvy’s article “Type-directed Partial Evaluation” but some delicate issues
were left implicit. Therefore, we have set out to clarify the principal differences
and commonalities of the two approaches and we have performed some practical
experiments using representative implementations of both approaches.

For the traditional partial evaluators, we have used the PGG system. It
is an implementation of the cogen-by-hand approach for the Scheme language.

For type-directed partial evaluation (TDPE), we have used Danvy’s implementation
for Scheme and our own reimplementation in Scheme to perform
experiments with self-application of TDPE.

Although both systems are based on Scheme, the two approaches are appli- 
cable to other functional languages like ML and Haskell. 

Our experiments concentrate on compilation, the flagship application of partial
evaluation. Compiling with a partial evaluator means specializing an
interpreter for a programming language L with respect to an L-program. We call
such an interpreter as well as any program submitted to a specializer the subject 
program. We have found that: 

— both approaches to specialization require changes to the subject programs 
to specialize well; 

— if the subject program has a simple type and polyvariant program point 
specialization is not required then TDPE can simulate traditional partial 
evaluation without change, otherwise if the subject program is untyped some 
transformations are required to hide the untyped parts from TDPE; 

— in all cases, a traditional partial evaluator can achieve the result of a
specialization with TDPE, subject to improving binding times using eta-expansion
and repeating the binding-time analysis;

— both specialization methods have comparable runtimes (excluding type in- 
ference, compilation, etc). 

These observations hold for TDPE as originally defined. Recently, Danvy
and others have defined online variants of TDPE that are strictly more
powerful and yield strictly better results than traditional methods. To match
these results, a traditional partial evaluator must revert to multiple passes
including aggressive post-processing.

The rest of the paper is structured as follows. In Sec. 2 we introduce
traditional partial evaluation as well as TDPE. Section 3 formulates important
aspects of specialization and puts them into perspective with respect to both
approaches. Section 4 shows how both approaches relate to each other in terms
of an eta-expanding binding-time analysis. Section 5 analyzes the run-time
behavior of both approaches.

Throughout, we assume familiarity with basic notions of partial evaluation.

expressions   $Expr \ni e ::= x \mid c \mid e \,@\, e \mid \lambda x.e$

types         $Type \ni t ::= b \mid t \to t$

reduction     $(\lambda x.e_1) \,@\, e_2 \to e_1\{x := e_2\}$

Fig. 1. The simply-typed lambda calculus



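two-level expressions   $TLT \ni e ::= x \mid e \,\overline{@}\, e \mid \overline{\lambda}x.e \mid e \,\underline{@}\, e \mid \underline{\lambda}x.e$

two-level types         $t ::= \overline{b} \mid t \,\overline{\to}\, t \mid \underline{b} \mid t \,\underline{\to}\, t$

static reduction        $(\overline{\lambda}x.e_1) \,\overline{@}\, e_2 \to e_1\{x := e_2\}$

$$\vdash_{dyn} \underline{b} \qquad \vdash_{dyn} t_2 \,\underline{\to}\, t_1$$

$$\vdash_{wft} \overline{b} \qquad \vdash_{wft} \underline{b}$$

$$\frac{\vdash_{wft} t_2 \quad \vdash_{wft} t_1}{\vdash_{wft} t_2 \,\overline{\to}\, t_1} \qquad \frac{\vdash_{wft} t_2 \quad \vdash_{wft} t_1 \quad \vdash_{dyn} t_2 \quad \vdash_{dyn} t_1}{\vdash_{wft} t_2 \,\underline{\to}\, t_1}$$

$$A\{x : t\} \vdash_{bta} x : t$$

$$\frac{A\{x : t_2\} \vdash_{bta} e : t_1 \quad \vdash_{wft} t_2 \,\overline{\to}\, t_1}{A \vdash_{bta} \overline{\lambda}x.e : t_2 \,\overline{\to}\, t_1} \qquad \frac{A \vdash_{bta} e_1 : t_2 \,\overline{\to}\, t_1 \quad A \vdash_{bta} e_2 : t_2}{A \vdash_{bta} e_1 \,\overline{@}\, e_2 : t_1}$$

$$\frac{A\{x : t_2\} \vdash_{bta} e : t_1 \quad \vdash_{wft} t_2 \,\underline{\to}\, t_1}{A \vdash_{bta} \underline{\lambda}x.e : t_2 \,\underline{\to}\, t_1} \qquad \frac{A \vdash_{bta} e_1 : t_2 \,\underline{\to}\, t_1 \quad A \vdash_{bta} e_2 : t_2}{A \vdash_{bta} e_1 \,\underline{@}\, e_2 : t_1}$$

Fig. 2. The simply-typed two-level lambda calculus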



2 Two Paradigms for Specialization 

This section presents the kernels of the two paradigms for specialization. To
simplify our later comparison, the formal presentation focuses on a simply-typed
lambda calculus with full β-reduction (see Fig. 1). We omit the typing rules since
they are standard.

To discuss specialization, we employ the two-level call-by-name lambda calculus
defined in Fig. 2. Overlining denotes static (specialization-time) constructs
and underlining denotes dynamic (run-time) constructs.

To discuss implementations of specialization, we augment the simply-typed
lambda calculus with a datatype Expr for generated or specialized expressions.
The operations on Expr are mkLam : Var × Expr → Expr, which constructs a
lambda expression, and mkApp : Expr × Expr → Expr. In addition, the impure
operator gensym : Unit → Var generates a new variable every time it is called.
Most of the time we are using the derived operations shown in Fig. 3.

The specifications are for specializing a call-by-name lambda calculus.
Specializing the call-by-value lambda calculus or the computational lambda calculus
(call-by-value with certain side effects) requires identical changes to both
specifications.

$\underline{\lambda}\, f$ = let y = gensym () in mkLam(y, f y)

$f \,\underline{@}\, a$ = mkApp(f, a)

Fig. 3. Derived operations on Expr
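
The datatype Expr and the derived operations can be sketched in Python as follows. The class names `Var`, `Lam`, `App` and the function names `mk_lam`, `mk_app`, `dyn_lam`, `dyn_app` are choices made for this illustration (the text only fixes mkLam, mkApp, and gensym); the implementations used in our experiments are written in Scheme.

```python
from dataclasses import dataclass
from itertools import count

# Residual (generated) expressions.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: Var
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

_counter = count()

def gensym():
    """Impure operator: returns a new variable every time it is called."""
    return Var(f"x{next(_counter)}")

def mk_lam(var, body):
    return Lam(var, body)

def mk_app(fun, arg):
    return App(fun, arg)

def dyn_lam(f):
    """Underlined lambda: f maps an Expr to an Expr (higher-order syntax)."""
    y = gensym()
    return mk_lam(y, f(y))

def dyn_app(f, a):
    """Underlined application: builds a residual application node."""
    return mk_app(f, a)

# The residual identity function: its body is its own fresh parameter.
identity = dyn_lam(lambda y: y)
```

The higher-order representation lets the host language handle variable binding, so that `gensym` is the only place where names are invented.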



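$cogen : TLT \to Impl$

$cogen(x) = x$

$cogen(\overline{\lambda}x.e) = \lambda x.cogen(e)$   $cogen(\underline{\lambda}x.e) = \underline{\lambda}\, (\lambda x.cogen(e))$

$cogen(f \,\overline{@}\, a) = (cogen(f))\,(cogen(a))$   $cogen(f \,\underline{@}\, a) = (cogen(f)) \,\underline{@}\, (cogen(a))$

Fig. 4. Cogen-based offline partial evaluation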



2.1 Cogen-Based Offline Partial Evaluation 

A natural approach to implementing a cogen-by-hand partial evaluation system
is to provide an implementation of each of the constructs of the two-level lambda
calculus. In this case, the generation of the dedicated specializer boils
down to a trivial translation that replaces each construct in a term of the
two-level calculus by its implementation.

Figure 4 shows such an implementation in terms of a mapping cogen from
two-level expressions TLT to an implementation language Impl. The $\underline{\lambda}$ combinator
interprets the dynamic lambda using a higher-order syntax representation,
the $\underline{@}$ combinator interprets dynamic application. The static constructs map
into the corresponding constructs of the implementation language.
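
As a rough illustration of this translation, the following Python sketch maps two-level terms to Python closures. It departs from the specification in one respect that should be kept in mind: instead of producing program text in Impl, it produces a closure that expects an environment. The tuple encoding with the tags `var`, `slam`, `sapp`, `dlam`, `dapp` and all function names are assumptions of this sketch.

```python
from itertools import count

_fresh = count()

def gensym():
    return f"x{next(_fresh)}"

def dyn_lam(f):
    """Dynamic lambda: generate a residual abstraction over a fresh name."""
    y = gensym()
    return ("lam", y, f(y))

def dyn_app(f, a):
    """Dynamic application: generate a residual application."""
    return ("app", f, a)

def cogen(term):
    """Translate a two-level term into a generator `env -> value`."""
    tag = term[0]
    if tag == "var":
        name = term[1]
        return lambda env: env[name]
    if tag == "slam":                     # static lambda: a host function
        _, x, body = term
        gen_body = cogen(body)
        return lambda env: (lambda v: gen_body({**env, x: v}))
    if tag == "sapp":                     # static application: a host call
        gen_f, gen_a = cogen(term[1]), cogen(term[2])
        return lambda env: gen_f(env)(gen_a(env))
    if tag == "dlam":                     # dynamic lambda: emits code
        _, x, body = term
        gen_body = cogen(body)
        return lambda env: dyn_lam(lambda v: gen_body({**env, x: v}))
    if tag == "dapp":                     # dynamic application: emits code
        gen_f, gen_a = cogen(term[1]), cogen(term[2])
        return lambda env: dyn_app(gen_f(env), gen_a(env))
    raise ValueError(f"unknown construct: {tag!r}")

# (static-lambda x. dynamic-lambda y. x dyn@ y) static@ g, with g dynamic.
term = ("sapp",
        ("slam", "x", ("dlam", "y", ("dapp", ("var", "x"), ("var", "y")))),
        ("var", "g"))
residual = cogen(term)({"g": "g"})
```

Running the generator performs the static application and leaves only the dynamic constructs in `residual`, here an abstraction whose body applies g to the fresh parameter.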



2.2 Type-Directed Partial Evaluation 

The heart of an implementation of TDPE is a pair of mutually recursive functions,
reify and reflect, shown in Fig. 5. Reify, written as $\downarrow^{t} v$, accepts a type
t and a value v of type t and converts it into an expression of type t. Reflect,
written $\uparrow^{t} e$, accepts a type t and an expression e of type t and converts it into a
value of type t. Type denotes simple types. Value stands for (compiled) values,
Expr denotes (specialized) expressions and TLT are two-level expressions.

Reify at a function type constructs a lambda expression and applies the 
corresponding function v from the compiled program to the freshly generated 
variable. Reflect constructs an application expression and wraps it into a function 
to transport it to the correct place in the compiled program. 

$\downarrow : Type \to Value \to Impl$    $\uparrow : Type \to Expr \to Impl$

reify    $\downarrow^{b} v = v$

         $\downarrow^{t_1 \to t_2} v$ = let y = gensym () in mkLam(y, $\downarrow^{t_2}$ (v ($\uparrow^{t_1}$ y)))

reflect  $\uparrow^{b} e = e$

         $\uparrow^{t_1 \to t_2} e = \lambda v.\, \uparrow^{t_2}$ (mkApp(e, $\downarrow^{t_1}$ v))

Fig. 5. Specification of TDPE
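
The reify/reflect pair can be transcribed almost literally. The Python sketch below is only an illustration under stated assumptions: types are encoded as the string "b" for the base type or a tuple ("->", t1, t2), residual expressions are tuples, residual variables are plain strings, and host-language functions play the role of compiled values.

```python
from itertools import count

_fresh = count()

def gensym():
    return f"x{next(_fresh)}"

def reify(t, v):
    """Convert a value v of type t into a residual expression."""
    if t == "b":
        return v
    _, t1, t2 = t
    y = gensym()
    # Apply the compiled function to the reflected fresh variable.
    return ("lam", y, reify(t2, v(reflect(t1, y))))

def reflect(t, e):
    """Convert a residual expression e of type t into a value."""
    if t == "b":
        return e
    _, t1, t2 = t
    # Wrap the residual application into a host function.
    return lambda v: reflect(t2, ("app", e, reify(t1, v)))

b_to_b = ("->", "b", "b")

# A "compiled" value containing a static redex that normalization removes.
value = lambda f: lambda x: (lambda g: g(x))(f)
normal = reify(("->", b_to_b, b_to_b), value)
```

The inner redex `(lambda g: g(x))(f)` is executed by the host language, so the residual term is the eta-long normal form with two abstractions and one application.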



3 Aspects of Specialization Techniques 

Each of the following subsections considers one aspect of offline specialization 
techniques and comments on its significance for the traditional as well as for the 
type-directed approach. 

3.1 Types 

Types play an important role for both approaches [p. 247, Sec. 2]. TDPE
specializes by normalizing a compiled object with respect to its simple type. 
In practice, the specialized code has a simple type while the computation at 
specialization time can be untyped. The source program has to be closed with 
respect to primitive operations and the dynamic values because that is the only 
way their types can be specified. In contrast to many traditional systems, TDPE 
does not admit recursive types since there is no general way to reflect dynamic 
values. 

In many traditional systems, the binding-time analysis imposes a certain type
discipline upon the language, mostly based on simple typing with partial types
and recursion (Lambdamix, Henglein’s analysis [23], Similix 5.0, and in
particular the PGG system). The binding-time analysis defers untypable
expressions to run-time. Furthermore, all expressions that depend on dynamic data
are also made dynamic. In consequence, the specialization-time computation has
a simple type with recursion whereas the specialized program is unconstrained:
it can be arbitrarily untyped and/or impure.

In principle, a binding-time analysis could also be based on more powerful
type systems. For example, systems with subtyping, soft typing, or
dynamic typing could provide a suitable basis.

While the type of the subject program is usually fixed in the traditional 
approach, this need not be the case for TDPE. This flexibility is due to the 
interpretive nature of the reify / reflect pair of functions, both of which interpret 
their type argument. In terms of the compilation paradigm, it means that the 
type of the interpreter may depend on the type of its source program. This has 
two consequences: 

1. it is possible to perform “simple type specialization”, which can remove
type tags in a typed interpreter during specialization;

2. it is no longer sufficient to know the type of the subject program
(interpreter) before starting a specialization, but it is necessary to know the type
of the subject program applied to the static input, that is, the type of the in- 
terpreter applied to its source program. Effectively, it boils down to perform 
type inference of the source program before being able to compile it. In gen- 
eral, this type inference cannot be done by the compiler which compiles the 
interpreter, but there must be a separate program to type check the source 
program. 

3.2 Polyvariance and Program Point Specialization 

Polyvariant specialization means: 

Any program point in the source program gives rise to zero or more 
program points in the specialized program.

Both techniques, traditional partial evaluators and TDPE, perform polyvariant 
specialization in this sense. 

A related, but different issue is concerned with program point specialization. 
This technique uses memoization points to avoid code duplication and poten- 
tially infinite specialization. The specialized programs are sets of mutually re- 
cursive function definitions. 

Program point specialization is a standard feature of traditional partial
evaluators, which has not yet been achieved for TDPE. It gives rise to a
number of problems; for example, traditional systems must use a closure-based
representation for functions to be able to compare them.
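
The idea of memoization points can be illustrated with a hand-written specializer for a single function. The Python sketch below is not how PGG or any other system discussed here is implemented; it specializes a hypothetical `power(x, n)` with respect to a static `n`, and the names `specialize_power`, `memo`, and `defs` are assumptions of this sketch. Each pair of program point and static value yields at most one residual definition.

```python
def specialize_power(n, memo, defs):
    """Specialize power(x, n) for static n; return the residual function name."""
    key = ("power", n)                 # program point plus static data
    if key in memo:
        return memo[key]               # reuse instead of duplicating code
    name = f"power_{n}"
    memo[key] = name                   # register before descending
    if n == 0:
        body = "1"
    else:
        callee = specialize_power(n - 1, memo, defs)
        body = f"x * {callee}(x)"
    defs[name] = f"def {name}(x): return {body}"
    return name

memo, defs = {}, {}
entry = specialize_power(3, memo, defs)

# The specialized program is a set of top-level function definitions.
namespace = {}
for source in defs.values():
    exec(source, namespace)
result = namespace[entry](2)
```

Because the memo table is consulted before any code is generated, a second request for the same static value returns the existing residual function.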



3.3 Continuation-Based Reduction 

Continuation-based reduction is a commonly used name for a contextual
binding-time-improving transformation which takes place during specialization [7].
Continuations are one implementation technique for it. Its main use is the
implementation of non-standard reductions in two-level calculi and of dynamic choice
of static values. Both systems make use of the same technique to guarantee
sound specialization in the presence of computational effects. Both
systems have options to switch this feature off for selected operators. For these
operators, the systems perform a call-by-name transformation (i.e., the operator
has no side effects).



3.4 Computational Effects 

There are two dimensions of computational effects to consider: effects at special- 
ization time and effects in the specialized program. 

Since TDPE does not constrain the computation at specialization time, side 
effects during specialization are permitted as long as they respect the typing 
properties. Recent results enable traditional systems to perform impure
computations at specialization time, too. Two important problems here are side
effects under dynamic control and polyvariant program point specialization, which
are discussed elsewhere in this paper.

Both approaches handle computational effects in the specialized program in
essentially the same way. The necessary modifications to Figures 4 and 5
are almost identical.



3.5 Compilation vs. Interpretation 

TDPE as described by Danvy applies to a compiled program and hence
performs specialization at full compiled speed. Indeed, this observation is one of
Danvy’s key contributions because the idea of normalization by evaluation has
been developed in an interpretive setting.

A traditional partial evaluator accesses the source program to perform the 
binding-time analysis. Then it interprets the resulting annotated program. A 
cogen-based partial evaluator turns the annotated program into a program gen- 
erator as indicated by the cogen function. The program generator has (assuming 
a monovariant binding-time analysis) about the same size as the original pro- 
gram and the static parts run at full compiled speed, since their translation is 
trivial. 

Specialization with either system involves compilation: 

— The source program must be compiled to be amenable to TDPE. 

— In a cogen-based system, the cogen function maps a binding-time annotated 
program to a program generator. The generator is then compiled before it 
can perform specialization. 

There is a difference in the kind of program that is compiled in the two sys- 
tems. In the cogen-based system, the compiled program consists of the static 
part of the subject program interspersed with syntax constructors, which gen- 
erate code during specialization. TDPE, on the other hand, uses the original 
program in compiled form. This may result in a more optimized compiled pro- 
gram. 



3.6 Termination 

Neither TDPE nor traditional systems can guarantee the termination of special- 
ization if there are static fixpoints present in the subject program. However, if 
the subject program is restricted to the simply-typed lambda calculus (without 
recursion) then TDPE is sure to terminate, by the strong normalization prop- 
erty of the calculus. A traditional system can employ an extra program analysis 
which guarantees the termination of specialization.

3.7 Binding-Time Improvements

A binding-time improvement is a program transformation geared at improving 
the effectiveness of specialization. This goal is usually achieved by using special 
programming idioms to separate the binding times. 

Traditional specializers require binding-time improvements whereas
TDPE does not. However, achieving satisfactory results is not trivial
using TDPE, either. It is still necessary to address some issues.

Staging Discipline The subject program must obey a staging discipline. It 
means that the subject program must compute all values that we expect to 
see in the specialized programs. This is non-obvious so we illustrate it using an 
example. 

Returning to the compilation application, consider the specialization of an 
environment lookup ρ(v) in an interpreter. Obviously, the name v of the variable
should disappear in the specialized (compiled) program. Instead, we expect
something like lookup store 5 as compiled code. But to this end the inter- 
preter must actually compute the index/address 5 and perform the environment 
lookup in two stages: in the first (static) stage, compute the index using an ad- 
ditional (static) environment; in the second (dynamic) stage, perform lookup 
store index. This is quite similar to separating binding times, which is what 
binding-time improvements are about. 
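
The two stages of the lookup can be made explicit in a few lines. In the following Python sketch, the static environment is a list of variable names whose positions serve as addresses, and the residual code is a tuple; both encodings and the names `compile_var` and `run` are assumptions made for this illustration.

```python
def compile_var(name, static_env):
    """Static stage: resolve the name to an index and emit residual code."""
    index = static_env.index(name)        # computed at specialization time
    return ("lookup", "store", index)     # residual code: lookup store index

def run(code, store):
    """Dynamic stage: execute the residual lookup against the run-time store."""
    op, _, index = code
    assert op == "lookup"
    return store[index]

static_env = ["a", "b", "c", "d", "e", "v"]
code = compile_var("v", static_env)
value = run(code, [10, 20, 30, 40, 50, 60])
```

The name v no longer occurs in `code`; only the index computed in the static stage survives, as in the residual expression lookup store 5.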

The work of Harrison and Kamin provides some more evidence for this
fact. For example, they must precompute some labels to compile conditional 
jumps. Their solution is to have the interpreter interpret the code in two passes: 
in the first pass, it computes the addresses, and then it “snaps back” to perform 
the actual computation in the second pass. 



Dynamic Recursion A program contains dynamic recursion if there is at least 
one loop that is controlled by run-time (i.e., dynamic) data. This is a frequent 
phenomenon, since most non-trivial programs contain dynamic recursion. For 
instance, an interpreter will interpret loops or recursive functions in its source 
language by dynamic recursion. Traditional partial evaluation memoizes calls
to the specializer at points that may give rise to dynamic recursion: dynamic
conditionals and dynamic lambda abstractions. Thus it creates a set of
top-level mutually recursive functions, one for each memoized call to the specializer.

This approach is currently not possible with TDPE. Instead, the general idea 
to deal with dynamic recursion is to provide a fixed-point combinator at a 
suitable type and abstract the subject program over the fixed-point operator. 
This may not be easy as demonstrated with the following example, again dealing 
with compilation. 

Suppose we want to compile the language Mixwell, which provides mutually
recursive first-order functions. The typical semantics of such a language
includes semantic equations $\mathcal{E} : Exp \to FunctionEnv \to ValueEnv \to Value$ for
expressions and goes on to provide the meaning of a program $f_i(x_i) = e_i$ as the
environment

$$\mathrm{fix}\ \lambda\psi \in FunctionEnv.\, [f_i \mapsto \lambda y_i.\, \mathcal{E}[\![e_i]\!]\, \psi\, [x_i \mapsto y_i]]$$

It turns out that it is not straightforward to come up with a suitable fixed-point 
combinator fix. 

One strategy (suggested by Danvy, see footnote) is to implement the function
environment $\psi$ as a function and use the fixed-point combinator at a different type for
each program: this turns out to be problematic because the type of $\psi$ must
reflect the arity of each $f_i$. In other words, the type of the result of $\psi f_i$ depends on
the particular argument $f_i$. Unfortunately, this kind of type is currently outside
the scope of TDPE, unless we employ the trick to hide the actual arity of the
functions inside of some kind of “dynamic” type. To do so, we have to rewrite
the interpreter by including coercions into and out of the dynamic type. But
still, the type of $\psi$ (and hence of fix) depends on the number of functions $f_i$ in
the source program.

If we take this last idea to the extreme, we implement the function
environment $\psi$ as an abstract type with lookup and update functions. This works
satisfactorily for the Mixwell interpreter, but it is still necessary to insert coercions.
In this case, the most natural implementation of $\psi$ is a list and the
implementation of the fixed-point combinator uses an impure update operation. Anyway,
we must split the function environment in the interpreter into a compile-time
and a run-time part to achieve a static lookup, just as explained above for the
variable lookup.

Dynamic Conditional On a smaller scale, a similar problem exists for dynamic 
conditionals. In the traditional approach, the specializer inserts a memoization 
point at a dynamic conditional. This is not possible with TDPE. 

In principle, there is no problem with handling dynamic conditionals in 
TDPE. However, the solution can give rise to code duplication, which is
why most examples use a workaround. The trick is to hide the conditional inside 
a primitive function and abstract it. Clearly, to avoid changing the semantics in 
a call-by-value language, such a primitive must work at a functional type. 

4 Qualitative Comparison 

In this section, we investigate the exact relationship between TDPE and an 
aggressive binding-time analysis. Although TDPE does not perform an explicit 
binding-time analysis, it does so implicitly. We find that using eta-expansion
enables the traditional approach to achieve the same results as TDPE. We
illustrate this with an example drawn from an applied lambda calculus.

Consider the expression given by

$$E = (\lambda f.\, f \,@\, ((\mathrm{if}\ \#f\ f\ g) \,@\, 0)) \,@\, (\lambda z.z). \qquad (1)$$

^ in personal communication, March ’98. 
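
$\Downarrow : Type \to TLT \to TLT$

$\Uparrow : Type \to TLT \to TLT$

reify    $\Downarrow^{b} e = e$

         $\Downarrow^{t_1 \to t_2} e = \underline{\lambda}x.\, \Downarrow^{t_2} (e \,\overline{@}\, (\Uparrow^{t_1} x))$

reflect  $\Uparrow^{b} e = e$

         $\Uparrow^{t_1 \to t_2} e = \overline{\lambda}v.\, \Uparrow^{t_2} (e \,\underline{@}\, \Downarrow^{t_1} v)$

Fig. 6. TDPE using two-level eta-expansion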



Assuming that g is known to be a function, but otherwise unknown (dy- 
namic), we explore three different binding-time annotation schemes. But first, 
we rephrase TDPE so that it maps two-level expressions to two-level expressions. 

4.1 TDPE and Two-Level Eta-Expansion 

If we rewrite TDPE to a syntactic transformation on two-level terms as in Fig. 6,
it turns out that this new specification is related to the implementation in Fig. 5
via the cogen function.

Lemma 1. Suppose $e \in TLT$ with all annotations static and $\vdash_{bta} e : t$. Then

1. $\vdash_{bta} \Downarrow^{t} e : t'$ where the underlying type of $t'$ is identical to t, but with all
annotations dynamic;

2. $\downarrow^{t} cogen(e) = cogen(\Downarrow^{t} e)$.

4.2 Standard Binding-Time Analysis 

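A standard binding-time analysis [23] (as implemented in PGG or in Similix)
yields the two-level term

$$(\overline{\lambda} f.\, f \,\underline{@}\, ((\overline{\mathrm{if}}\ \#f\ f\ g) \,\underline{@}\, 0)) \,\overline{@}\, (\underline{\lambda} z.z) \qquad (2)$$

which statically reduces to

$$(\underline{\lambda} z.z) \,\underline{@}\, (g \,\underline{@}\, 0).$$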

This result is not satisfactory because the dynamic g pollutes the surrounding 
binding times. 

4.3 Eta-Expanding Binding-Time Analysis 

Since the type of the free variable g is known to be A → A for some base type
A, a prephase of the binding-time analysis could exploit this knowledge and
eta-expand g to yield

$$(\lambda f.\, f \,@\, ((\mathrm{if}\ \#f\ f\ (\lambda w.\, g \,@\, w)) \,@\, 0)) \,@\, (\lambda z.z). \qquad (3)$$

The same could be done with the type of the result. However, in the present
example the result has type A, a base type.

Applying the same binding-time analysis as before to the term (3) yields

$$(\overline{\lambda} f.\, f \,\overline{@}\, ((\overline{\mathrm{if}}\ \#f\ f\ (\overline{\lambda} w.\, g \,\underline{@}\, w)) \,\overline{@}\, 0)) \,\overline{@}\, (\overline{\lambda} z.z) \qquad (4)$$

and static reduction of this term results in

$$g \,\underline{@}\, 0.$$



4.4 Type-Directed Partial Evaluation 

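To simulate a binding-time analysis for TDPE, we first have to construct a
completely static variant of E (as we only have the compiled program at our
disposal) and close it over g:

$$E' = \overline{\lambda} g.\, (\overline{\lambda} f.\, f \,\overline{@}\, ((\overline{\mathrm{if}}\ \#f\ f\ g) \,\overline{@}\, 0)) \,\overline{@}\, (\overline{\lambda} z.z) \qquad (5)$$

The term E' has type t = (A → A) → A. Performing type-directed partial
evaluation is the same as expanding $\Downarrow^{t} E'$ and then compiling and running the
result. In this particular case, $\Downarrow^{t} [\ ]$ corresponds to the context:

$$\underline{\lambda} g.\, [\ ] \,\overline{@}\, \overline{\lambda} w.\, g \,\underline{@}\, w$$

The resulting term

$$\underline{\lambda} g.\, (\overline{\lambda} g.\, (\overline{\lambda} f.\, f \,\overline{@}\, ((\overline{\mathrm{if}}\ \#f\ f\ g) \,\overline{@}\, 0)) \,\overline{@}\, (\overline{\lambda} z.z)) \,\overline{@}\, \overline{\lambda} w.\, g \,\underline{@}\, w \qquad (6)$$

is statically convertible with the term (4) constructed by binding-time analysis
after eta expansion. One static reduction step brings us to

$$\underline{\lambda} g.\, (\overline{\lambda} f.\, f \,\overline{@}\, ((\overline{\mathrm{if}}\ \#f\ f\ (\overline{\lambda} w.\, g \,\underline{@}\, w)) \,\overline{@}\, 0)) \,\overline{@}\, (\overline{\lambda} z.z) \qquad (7)$$

which is identical to (4) up to the outermost abstraction of g. It statically
reduces to

$$\underline{\lambda} g.\, g \,\underline{@}\, 0. \qquad (8)$$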

It is easy to see that wrapping a free variable (or primitive operator) g whose 
type t is known into a bunch of eta-redexes corresponding to t enables the same 
reductions as TDPE. 
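
The example of this subsection can be replayed with a small reify/reflect pair. The Python sketch below is an illustration only, under the following assumptions: types are encoded as "b" or ("->", t1, t2), residual terms are tuples, and the static constant 0 is embedded into the residual term without an explicit lifting step. The host-language closure `E_prime` stands in for the compiled value of E'.

```python
from itertools import count

_fresh = count()

def reify(t, v):
    if t == "b":
        return v
    _, t1, t2 = t
    y = f"x{next(_fresh)}"
    return ("lam", y, reify(t2, v(reflect(t1, y))))

def reflect(t, e):
    if t == "b":
        return e
    _, t1, t2 = t
    return lambda v: reflect(t2, ("app", e, reify(t1, v)))

A = "b"

# E' closed over g; `False` plays the role of the constant #f.
E_prime = lambda g: (lambda f: f((f if False else g)(0)))(lambda z: z)

# Residualize at type (A -> A) -> A.
residual = reify(("->", ("->", A, A), A), E_prime)
```

All static redexes are executed by the host language, and the residual term is an abstraction whose body applies the bound variable to 0, which corresponds to the term (8) up to the name of the bound variable.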

Lemma 2. Suppose E[g] is an expression with free variable g : t. Then
TDPE-style eta-expansion of $\lambda g.E[g]$ is statically convertible to E[g'], where each
occurrence of g is replaced by $g' = \Uparrow^{t} g$. That is, the eta-expansion of

$\lambda g.E[g]$ statically reduces to $\lambda z.\, E[\Uparrow^{t} z]$.

[Diagram: (1) source program → eta-expanded program; (2) eta-expanded program → two-level program; (3) source program → compiled value & user-annotated type; (4) compiled value & user-annotated type → two-level program]

Fig. 7. TDPE and binding-time analysis



4.5 Assessment 

Although it is possible to automate an eta-expanding prephase to the
binding-time analysis as in Sec. 4.3, current systems leave it to the programmer to
perform the expansion manually as a binding-time improvement. In
contrast, TDPE always performs the eta-expansion because that is its fundamental
working principle. If eta-expansion is not possible, e.g., if the top-level type 
has negative occurrences of recursive types, TDPE is not applicable, so manual 
workarounds have to be developed. In the traditional approach, eta-expansion is 
only required very rarely and it has to be done by hand, anyway. 

A conclusion to draw from this example is that an eta-expanding binding- 
time analysis anticipates some static reduction steps that TDPE has to execute. 
This anticipation is possible because every binding-time analysis relies on a flow 
analysis. TDPE has to unveil this flow on the fly. However, its flow information 
is precise — in fact perfect — because it happens at run-time and exploits actual 
values (for example, TDPE ignores semantically dead code which may deterio- 
rate the results of a static analysis), whereas a flow analysis can only provide 
conservative approximations. 

The diagram in Fig. 7 summarizes this subsection. The arrow (1) signifies
type-guided eta expansion, (2) is a binding-time analysis, (3) is compilation
with user-annotated type information, and (4) is the reify/reflect phase of
TDPE. The results of (2) and (4) are statically convertible.

4.6 Type Dependencies 

In our opinion, the strength of TDPE lies in applications where the type for 
residualization depends on the particular static input. Simple type specializa- 
tion H3 is such an application. In those cases, the traditional approach is still 
viable in the presence of an eta-expanding binding-time analysis. However, it 
would be forbiddingly expensive because we would have to perform the binding- 
time analysis for each combination of the subject program and static input. In 
addition, we would have to run cogen and to compile the resulting generating 
extension. 

On the other hand, many applications keep the type argument of reify fixed,
among them compilation. Here, the extra power provided by TDPE cannot be 
exploited and it seems more advantageous to use a traditional cogen-based sys- 
tem because it provides additional features like memoization for free. 

If the type argument t is fixed, self-application of TDPE d sec. 2. 4] can 
produce a customized eta-expander which is specialized with respect to t. Of 
course, any other specialization mechanism could do the job, too. 

5 Quantitative Comparison 

In this section, we report on experiments that we conducted to examine the prag- 
matics of both systems more accurately. They are all concerned with compilation 
by specializing interpreters for various languages. These are: 

Tiny d A simple imperative language with expressions, assignments, if, 
while, and sequence commands. We have examined one version in direct 
style and another in continuation-passing style. The same interpreters are 
considered in the TDPE papers ^^[O]. Since the interpreter already con- 
tained the proper combinators to handle dynamic recursion, no changes are 
required. 

Mixwell A first-order functional language with named procedures. The 
interpreter is written in direct style. As discussed in Sec. IT71 a binding-time 
improvement is required to make the interpreter amenable to TDPE. The 
changes do not affect Mixwell’s suitability for traditional specialization. 
Mini-Scheme A large subset of Scheme. We have implemented this interpreter 
to be able to run realistic compilations. To this end, we have used the pre- 
processor of the PGG system to transform Scheme into Mini-Scheme. The 
implementation uses techniques similar to those developed for the Mixwell 
interpreter. Mini-Scheme is written in continuation-passing style. 

5.1 Results of Specialization 

We have used both systems to specialize the above interpreters with respect 
to various programs. Since all programs that are amenable to TDPE can be 
processed by traditional specialization without using memoization, we switched 
this feature off in the PGG system |3S|- In all cases we were able to obtain 
specialized programs that were identical up to renaming of bound variables. 

— The Tiny interpreter in direct style is directly amenable to specialization 
with PGG without any modification. The residual programs are identical to 
those produced by TDPE. 

— The Tiny interpreter in continuation-passing style needs some binding-time 
improvements for specialization with the PGG system in the form of eta 
expansions as described in Sec. 4.4.

— The Mixwell interpreter — in its TDPE-ready version — is directly amenable 
to PGG specialization and yields identical results from both specializers. 

— The Mini-Scheme interpreter is equally amenable to TDPE and PGG and
produces identical residual programs. In this case, the interpreter requires 
no eta expansion although it is written in continuation-passing style. This is 
due to the use of dynamic primitives in direct style. 

5.2 Compile Times 

In the experiments, we have specialized the Mini-Scheme interpreter with respect
to realistic programs taken from Andrew Wright’s Scheme benchmark suite:
Boyer, a term-rewriting theorem prover; Gpscheme, an implementation of genetic
programming; and Matrix, which tests whether a random matrix
$M \in \{+1, -1\}^{n \times n}$ is maximal under permutation and negation of rows
and columns. Append is just a standard list appending program.



program               A: TDPE   B: TDPE-cogen   C: PGG    A/B    A/C    B/C
Append                   0.16            0.15     0.12   1.07   1.33   1.25
Boyer                    6.55            5.38     4.73   1.22   1.38   1.13
Gpscheme                10.13            8.85     8.87   1.14   1.14   0.99
Matrix                  15.22           13.16    12.46   1.15   1.22   1.05
Compiler Generation                      0.15     4.92                 0.03

Table 1. Compile-time benchmarks (in sec.)



Table 1 shows compile times in seconds for these programs. The TDPE column
indicates the times obtained with Danvy’s TDPE system and with our own
reimplementation of it. The TDPE-cogen column indicates the times obtained 
with self-applied TDPE sec. 2. 4]. To obtain these figures, we used our reim- 
plementation of TDPE. The PGG column lists the figures obtained using the 
PGG system. The remaining columns list ratios as indicated in their headers. 
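
As a quick sanity check on the ratio columns, the following Python sketch recomputes A/B, A/C and B/C from the compile times printed in Table 1. The dictionary layout and the helper name `ratios` are illustrative assumptions; only the numbers come from the table.

```python
# Compile times in seconds, copied from Table 1
# (A: TDPE, B: TDPE-cogen, C: PGG).
times = {
    "Append":   {"A": 0.16,  "B": 0.15,  "C": 0.12},
    "Boyer":    {"A": 6.55,  "B": 5.38,  "C": 4.73},
    "Gpscheme": {"A": 10.13, "B": 8.85,  "C": 8.87},
    "Matrix":   {"A": 15.22, "B": 13.16, "C": 12.46},
}

# Ratios as printed in Table 1, in the order (A/B, A/C, B/C).
printed = {
    "Append":   (1.07, 1.33, 1.25),
    "Boyer":    (1.22, 1.38, 1.13),
    "Gpscheme": (1.14, 1.14, 0.99),
    "Matrix":   (1.15, 1.22, 1.05),
}

def ratios(row):
    """Return the three ratios A/B, A/C and B/C for one table row."""
    return (row["A"] / row["B"], row["A"] / row["C"], row["B"] / row["C"])

for program, row in times.items():
    recomputed = ratios(row)
    # Largest deviation between a recomputed and a printed ratio.
    worst = max(abs(r - p) for r, p in zip(recomputed, printed[program]))
    print(program, [round(r, 3) for r in recomputed], "max deviation", round(worst, 3))
```

The recomputed ratios agree with the printed ones to within about 0.01, which is what one would expect if the printed ratios were derived from unrounded timings.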

All experiments were performed on an IBM PowerPC workstation with
128 MB RAM using Scheme48, an implementation of Scheme based on a bytecode
interpreter. Each column of data was obtained with a fresh Scheme48 session
and a heap image of 12 MB. All figures are averages of 5 consecutively
executed runs.

The TDPE-cogen column indicates that self-application does not play such 
a prominent role in achieving efficiency as in traditional partial evaluation. Al- 
though self-application removes the overhead of type interpretation, it only im- 
proves compile times by about 15%. Hence the interpretation of the type argu- 
ment does not contribute essentially to the specialization time. In the traditional 
approach, self-application usually speeds up the specialization by a factor of 4-6. 

Self-application at type t generates a term quite similar to IJ.* [ ]■ It does 
not depend on a particular subject program, but is generally applicable to all 
programs of type t. This is in contrast to traditional generating extensions which 
are tailored to specialize one particular subject program. 

We did not construct a TDPE compiler generator by hand. All it would do
is construct the same term as the TDPE-cogen. It might do it faster, but we
believe there is not much to improve on the compiler generation time in Table 1.

5.3 An Analytical Perspective 

Let us consider the issue of compile time from an analytical perspective. Looking 
back at the core of both specialization algorithms in Figures 0 and 0 there is 
actually not so much difference between the way specialization proceeds. 

Here is the case for PGG (see Fig. 0: For each lambda in the source program, 
the specializer produced by the PGG system processes one lambda and, in
the dynamic case, generates a fresh variable name, applies the function to
it, and constructs a lambda expression. Static applications are just applications
and a dynamic application just constructs an application expression. 

For TDPE (see Fig. 0 it is quite similar: TDPE must process each lambda 
and each application in the source program. In addition to that, for each lambda 
that is deemed to be dynamic, it generates a fresh variable name, applies a func- 
tion to it, and constructs a residual lambda. For each application that is deemed 
dynamic, it constructs a function that constructs the residual application. 

From the above discussion, we extract the following facts: 

— processing a static lambda, a dynamic lambda, a variable, or a static appli- 
cation involves exactly the same actions for cogen-based partial evaluation 
and TDPE; 

— processing a dynamic application involves constructing the residual applica- 
tion in both cases, in addition TDPE has to construct a function to pass it 
to the compiled application. 
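
The steps described above for TDPE can be illustrated with a minimal Python sketch of reify/reflect for the pure simply typed fragment. It is not the code of either system measured here: the type encoding (`BASE`, `arrow`), the tuple representation of residual terms, and the name `make_tdpe` are assumptions made for illustration, and let-insertion and dynamic primitives are omitted.

```python
import itertools

BASE = "b"  # the single (dynamic) base type of this sketch

def arrow(t1, t2):
    """Build the function type t1 -> t2."""
    return ("->", t1, t2)

def make_tdpe():
    """Return a (reify, reflect) pair sharing one fresh-name supply."""
    counter = itertools.count()

    def fresh():
        return f"x{next(counter)}"

    def reify(t, value):
        # At base type the value already is a residual term.
        if t == BASE:
            return value
        _, t1, t2 = t
        # Dynamic lambda: generate a fresh name, apply the function to the
        # reflected variable, and construct the residual lambda.
        x = fresh()
        return ("lam", x, reify(t2, value(reflect(t1, ("var", x)))))

    def reflect(t, term):
        if t == BASE:
            return term
        _, t1, t2 = t
        # Dynamic application: construct a function that, when called,
        # constructs the residual application.
        return lambda value: reflect(t2, ("app", term, reify(t1, value)))

    return reify, reflect

# Residualize the Church numeral two at type (b -> b) -> (b -> b).
reify, _ = make_tdpe()
church_two = lambda f: lambda x: f(f(x))
residual = reify(arrow(arrow(BASE, BASE), arrow(BASE, BASE)), church_two)

# A static beta-redex disappears during residualization.
reify2, _ = make_tdpe()
identity_residual = reify2(arrow(BASE, BASE), lambda x: (lambda y: y)(x))
```

The extra `lambda value: ...` built in `reflect` corresponds to the additional function construction that the discussion attributes to TDPE for each dynamic application.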

The numbers in Table 1, which are averages of several subsequent measurements,
indicate that the extra lambda construction has no serious impact on
compile times. Indeed, after removal of type interpretation, compilation with
TDPE is about identical to compilation with PGG.

6 Related Work and Conclusion 

Apart from Danvy’s articles fSlOlHlE] there are few investigations on the 
pragmatics of TDPE. In his M.Sc. thesis pni, Rhiger performs experiments with 
action semantics interpreters where he compares the space and time requirements 
in a TDPE system with those in Similix. He concludes that compilation with 
TDPE is considerably faster and uses less space. 

In this paper, however, we compare TDPE with the technology used in the
PGG system and obtain different results. This is because PGG is based on a
hand-written cogen, which is faster than a self-applied partial evaluator.

We have found that TDPE delivers the same residual programs as traditional 
partial evaluation if the latter is coupled with an eta-expanding binding-time 
analysis. Even that was rarely necessary in the examples we considered. Disre- 
garding program generator generation time, we illustrated that the time taken by 
a cogen-based partial evaluator is similar to the time taken by a TDPE system.
However, TDPE allows for additional flexibility since it is possible to specify a 
different type for each static input. 

The position of TDPE could be further strengthened if it were possible to 
integrate the unavoidable type inference phase into the system. The recent in- 
tegration of TDPE into an ML programming system is a promising step for- 
ward. It eliminates many of the problems in making a program “TDPE-ready” . 
However, it is still insufficient for ambitious applications like simple type spe- 
cialization CD where the implementation of a type inference engine is left to the 
user. 

The online extensions of TDPE m provide more flexibility for the user 
and achieve a certain degree of polyvariant specialization which is common in 
traditional specializers. They also enhance the transformational power of TDPE 
beyond the reach of traditional one-pass offline techniques. 

Abstract. This paper develops an ML-style programming language with
first-class contexts, i.e., expressions with holes. A programming language
with first-class contexts can provide various advanced features such as 
macros, distributed programming and linking modules. A possibility of 
such a programming language was shown by the theory of simply typed 
context calculus developed by Hashimoto and Ohori. This paper extends 
the simply typed system of the context calculus to an ML-style polymor- 
phic type system, and gives an operational semantics and a sound and 
complete type inference algorithm. 



1 Introduction 

A context is a term with holes in it. The basic operation for a context is to fill its
holes with a term. For example, the term $(\lambda x.[\cdot])\ 5$ is a context, where $[\cdot]$ indicates
a hole. If its hole is filled with $f\ x$, we obtain $(\lambda x.f\ x)\ 5$. In this example, $x$ in
$f\ x$ is captured by $\lambda x$ in $(\lambda x.f\ x)\ 5$. That is, hole filling is essentially different
from capture-avoiding substitution in the lambda calculus.
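
The difference can be made concrete with a small Python sketch. The tuple encoding of terms and the function names `fill`, `subst` and `free_vars` are assumptions made for illustration; they are not part of the calculus. Filling the hole of $(\lambda x.[\cdot])\ 5$ with $f\ x$ captures $x$, whereas substituting $f\ x$ for a variable $h$ in $(\lambda x.h)\ 5$ renames the binder and leaves $x$ free.

```python
# Terms: ("var", x) | ("const", c) | ("lam", x, body) | ("app", f, a) | ("hole",)

def free_vars(t):
    """Set of free variables of a term (holes and constants have none)."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "lam":
        return free_vars(t[2]) - {t[1]}
    if tag == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return set()

def fill(ctx, t):
    """Hole filling: replace the hole literally, allowing variable capture."""
    tag = ctx[0]
    if tag == "hole":
        return t
    if tag == "lam":
        return ("lam", ctx[1], fill(ctx[2], t))
    if tag == "app":
        return ("app", fill(ctx[1], t), fill(ctx[2], t))
    return ctx

def subst(t, x, s):
    """Capture-avoiding substitution of s for the variable x in t."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == x else t
    if tag == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    if tag == "lam":
        y, body = t[1], t[2]
        if y == x:
            return t  # x is shadowed below this binder
        if y in free_vars(s):
            # Rename the binder so that it cannot capture variables of s.
            z = y
            while z in free_vars(s) | free_vars(body) | {x}:
                z += "'"
            body = subst(body, y, ("var", z))
            y = z
        return ("lam", y, subst(body, x, s))
    return t

f_x = ("app", ("var", "f"), ("var", "x"))

context = ("app", ("lam", "x", ("hole",)), ("const", 5))
filled = fill(context, f_x)            # (lambda x. f x) 5, x is captured

term = ("app", ("lam", "x", ("var", "h")), ("const", 5))
substituted = subst(term, "h", f_x)    # (lambda x'. f x) 5, x stays free
```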

Contexts have so far been a meta-level tool and their applicability to pro- 
gramming languages has been limited to meta-level manipulation of programs. 
We believe that if a programming language is extended with first-class contexts, 
the resulting language will provide various advanced features. Let us briefly 
mention a few of them. 

Type-safe macros. Holes can be regarded as some kind of meta variables 
ranging over a set of expressions. In this sense, a language extended with first- 
class contexts can naturally express macros based on token substitution seen in 
the C preprocessor. Such macros take arguments by hole filling. If the extended 
language is a typed one, then we can obtain type-safe macros.

Distributed programming. In the setting of distributed programming, we 
send program pieces to remote sites which share common resources such as 
common runtime libraries, and then program pieces are executed there making 
bindings for necessary resources. We can regard such pieces as open terms
and such resources as contexts. In this sense, necessary bindings are made by 
hole filling. 

Linking modules. A linking module which imports and exports some variables
can naturally be regarded as a context containing free variables whose values will
be supplied by some other contexts. One naive way of modeling a module exporting
a set of functions $F_1, \ldots, F_n$ through identifiers $f_1, \ldots, f_n$ would be to regard it
as a context such as $(\lambda f_1. \cdots ((\lambda f_n.[\cdot])\ F_n) \cdots)\ F_1$. To link such a module and
a program, hole filling will be performed. Since contexts are first-class, we can
write programs which link program modules together in a dynamic manner.

In spite of its many potential benefits, including those above, a programming
language with first-class contexts has not been well investigated. A
context-enriched calculus, called $\lambda C$, was proposed in cn by Lee and Friedman.
However, $\lambda C$ cannot represent contexts directly: in $\lambda C$, contexts are clearly
distinguished as “source code” from the ordinary lambda terms as “compiled
code”. The effect of hole filling is formulated as source code manipulation, and
then compiled to lambda terms.

In 0 , Hashimoto and Ohori have developed a simply typed lambda calculus 
with first-class contexts. In the calculus, contexts are truly first-class, and subject
reduction and the Church-Rosser property have been proved with respect to the reduction
which is the mixture of the ordinary $\beta$ and fill reductions. In this paper, we establish
a basis for a realistic programming language with first-class contexts based
on the context calculus. For this purpose, we need a more flexible type system and
a concrete evaluation strategy. We chose ML |3| as a base language. A prominent
feature of ML is the combination of polymorphism, robustness and compile-time
error detection. We would like to extend ML with first-class contexts while preserving
all those benefits of ML. The following are essential to achieve this goal.

— a polymorphic type deduction system, 

— an operational semantics and the soundness of the type system, and 

— a sound and complete type inference algorithm. 

We give an operational semantics by extending the usual call-by-value evaluation
strategy using closures. Developing a polymorphic type system and a type
inference algorithm is a more complex and challenging issue. In addition to the
problem of discovering an appropriate typing for contexts, there are two techni- 
cal difficulties to overcome. First, since contexts are first-class, we can define a 
function which receives contexts as its argument, and we can define a function 
which composes contexts. This indicates that the problem similar to the record 
concatenation will arise. Second, naive embedding of first-class contexts to ML 
type system makes types no longer first-order: the let construct introduces type 
schemes and then they may be provided through the hole in the scope of let 
binding. We solve the problems by using the techniques of typechecking record 
terms P2C3| for the former and by using the unification algorithm for polytypes 
p] for the latter. 

The rest of this paper is organized as follows. Section 2 introduces the target
language informally. In Section 3 we define the types, terms and the semantics,
and then prove type soundness. Section 4 defines the type inference algorithm.
Finally, in Section 5 we discuss further investigation and conclude the paper.

2 Informal Discussion

In addition to the ordinary ML constructs, we introduce labeled holes ranged
over by $X, Y, \ldots$, hole abstraction $\delta X.e$ and context application $e_1 \odot e_2$ as in
0. For example, the simple context $(\lambda x.[\cdot])\ 5$ that appeared in the previous section
is written as $\delta X.(\lambda x.X)\ 5$. To represent the effect of hole filling, we write
$(\delta X.(\lambda x.X)\ 5) \odot (f\ x)$, and this program will be evaluated to $(\lambda x.f\ x)\ 5$.

To see what kind of type should be given to a context, let us examine the relationship
between contexts and records. We can encode a record $\{l_1 = e_1, \ldots, l_n = e_n\}$ as

$$r = \delta X.(\lambda l_1. \cdots .\lambda l_n.X)\ e_1 \cdots e_n$$

where field access is simulated as $r \odot l_i$. We can also define a function which
composes two records as

$$comp = \lambda r_1.\lambda r_2.(\delta X.r_1 \odot (r_2 \odot X)).$$

Since we would like to get the information about which variables to be captured 
through the context applications, our system essentially contains the same diffi- 
culty on typing as the one of calculi for record concatenation mm . Especially, 
in H21, Rémy gives a record a type $\{\chi \Rightarrow \pi\}$ where $\chi$ and $\pi$ denote row variables.
The intuition is that a record of such a type behaves as if it were a function:
given any input row of fields $\chi$ it returns the output row $\pi$. For example, the
empty record has type $\{\chi \Rightarrow \chi\}$ since it does not change the input row, and the
one-element record $\{a = 1\}$ has type $\{\chi; a : - \Rightarrow \chi; a : int\}$, which adds one more
field that must not be previously defined, where $-$ denotes that the field $a$ is not
present. This view of record types is quite compatible with contexts, since
contexts also have an aspect of functions. Based on this observation, we give the
following type for contexts.

$$[\rho_1 \triangleright \tau_1 /\!/ \rho_2 \triangleright \tau_2]$$

A context of the above type takes a term of type $\tau_1$ as input which may have
free variables indicated in $\rho_1$, and returns a term of type $\tau_2$ with the variables
in $\rho_1$ being consumed to yield $\rho_2$. For instance, the context $\delta X.(\lambda x.X)\ 5$ has the
type $[\chi; x : int \triangleright \alpha /\!/ \chi; x : - \triangleright \alpha]$. Note that the direction of increase of fields is
inverted in the context. This form of context types is a generalization of simple
context types given in |S|, and is suitable for integrating them in a polymorphic 
type system. 

Let us mention a point to notice when typing for contexts is integrated with
ML-style polymorphism.

Since we have the let construct from ML, the following context can be defined.

$$r' = \delta X.\mathtt{let}\ l_1 = e_1\ \mathtt{in} \cdots \mathtt{let}\ l_n = e_n\ \mathtt{in}\ X$$

However, naive typing for the above expression introduces a higher-order type
$[\chi; l_1 : \sigma_1; \ldots; l_n : \sigma_n \triangleright \alpha /\!/ \chi; l_1 : -; \ldots; l_n : - \triangleright \alpha]$ beyond the type schemes of ML. In
our system, the above expression is typed as follows.

$$r' : [\chi; l_1 : [\sigma_1]; \ldots; l_n : [\sigma_n] \triangleright \alpha /\!/ \chi; l_1 : -; \ldots; l_n : - \triangleright \alpha]$$

where the type annotation $[\sigma_i]$ denotes a “monomorphic” polytype which locks instantiation
of the generic type variables of the original type scheme $\sigma_i$ as seen in 0. We
can use these polymorphic values in a context application $r' \odot_{\{l_1 : [\sigma_1]; \ldots; l_n : [\sigma_n]\}} e$,
where the annotation $\{l_1 : [\sigma_1]; \ldots; l_n : [\sigma_n]\}$ is needed to keep the decidability of type inference.
We call such an annotation associated with a context application an interface.
If unification for these locked polytypes succeeds, they are unlocked and can
be used as usual type schemes in $e$.

With these preparations, let us show some examples of programming with 
first-class contexts. 

Typed macros. We cannot write a function which manipulates programming 
text itself in the original ML. For example, given a conditional function cond, 
we can define ifThenElse as follows. 

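$$ifThenElse = \delta X.\delta Y.\delta Z.cond(X, Y, Z) : [\chi \triangleright bool /\!/ \chi \triangleright [\pi \triangleright \alpha /\!/ \pi \triangleright [\psi \triangleright \alpha /\!/ \psi \triangleright \alpha]]]$$

where the symbol $\odot$ associates to the left. If we use a usual function instead of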
the context, both then part and else part will be evaluated. The point here is 
that we can regard holes as meta variables. Another example is the following or 
operator which captures a variable v. 

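$$or = \delta X.\delta Y.\mathtt{let}\ v = X\ \mathtt{in}\ (ifThenElse \odot v \odot v \odot Y)$$

$$: [\chi \triangleright bool /\!/ \chi \triangleright [\pi; v : bool \triangleright bool /\!/ \pi; v : - \triangleright bool]]$$

The binding information that the above type provides will prevent the variable $v$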
from being unintentionally captured when programmers use this macro. We will 
be able to avoid automatically such unintended capture if we use the technique 
of hygienic macro expander which avoids name conflicts by time-stamping 
all such variables. 

Programs for distribution. We can treat programs which contain free vari- 
ables in presence of first-class contexts. In the setting of distributed program- 
ming, one often wants to send a piece of “open” program to a remote site and 
execute it there. We can define such a program as 

$$code = \delta X.X \odot_{\Gamma} e$$

where $\Gamma$ denotes an interface. In this case, a remote site which receives such a
program will prepare a program which at least provides bindings for variables in
$\Gamma$, for example, $\delta X.\mathtt{let}\ f_1 = e_1\ \mathtt{in} \cdots \mathtt{let}\ f_n = e_n\ \mathtt{in}\ X$ where $dom(\Gamma) \subseteq \{f_i\}$,
and apply code to it.

Linking modules. A module here is a program fragment which imports bind- 
ings for possibly free variables in the fragment, and exports some bindings. They 
are linked together to produce a complete program. We can represent a module 
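which imports $f_1, \ldots, f_n$ and exports $g_1, \ldots, g_m$ as follows.

$$M = \delta X.\delta Y.X \odot_{\{f_1 : \tau_1, \ldots, f_n : \tau_n\}} (\mathtt{let}\ g_1 = e_1\ \mathtt{in} \cdots \mathtt{let}\ g_m = e_m\ \mathtt{in}\ Y)$$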

■ ■ n-,gj]->allip-J^ : -c> /3]/x c> [tt; /* : rj > /* : -[>/?]] 

The input of this context is any context which exports $f_i$ and none of $g_j$, and
the output is a context which exports both $f_i$ and $g_j$. These modules are linked
together using the following link operator N.

$$m_1\ \mathrm{N}\ m_2 = \delta X.\delta Y.(m_1 \odot (m_2 \odot X)) \odot Y.$$

Note that N is associative and the result of applying this operator to a pair of
modules is again a module. To construct a complete program, we must
prepare two other kinds of module: $main = \delta X.X \odot_{\Gamma} e$, which exports no bindings,
and $top = \delta X.\mathtt{let}\ h_i = e_i\ \mathtt{in}\ X$, which imports no bindings. Then a complete
program is produced by $main \odot ((M_1\ \mathrm{N} \cdots \mathrm{N}\ M_k) \odot top)$.

3 The Language 

3.1 Types, Terms, and the Static Semantics 

We use the following notations for functions. The domain and the codomain of
a function $f$ are written as $dom(f)$ and $cod(f)$ respectively. We sometimes regard
a function as a set of pairs and write $\emptyset$ for the empty function. Let $f$ and $g$ be
functions. We write $f \uplus g$ for $f \cup g$ provided that $dom(f) \cap dom(g) = \emptyset$. The
restriction of a function $f$ to the domain $D$ and to the domain $dom(f) \setminus D$ are
written as $f|_D$ and $f\backslash_D$ respectively.

We let $V$ be a finite set of term variables. The set of types is given by the
syntax:

$$\begin{array}{rcll}
l & ::= & \theta \mid - \mid \tau & \text{Fields} \\
\rho^V & ::= & \chi^V \mid \rho^{V \cup \{x\}}; x : l & \text{Rows defining all variables except those in } V \\
\tau & ::= & \alpha \mid b \mid \tau \to \tau \mid [\rho^{\emptyset} \triangleright \tau /\!/ \rho^{\emptyset} \triangleright \tau] \mid [\sigma] & \text{Monotypes} \\
\xi & ::= & \alpha \mid \chi \mid \theta & \text{Variables} \\
\sigma & ::= & \tau \mid \forall\xi.\sigma & \text{Type schemes}
\end{array}$$

where $x, y, \ldots$ range over a countably infinite collection of term variables, $\alpha$, $\beta$
and $\gamma$ range over a countably infinite collection of type variables, $\chi$, $\pi$ and $\psi$
range over a countably infinite collection of row variables, $\theta$ ranges over a
countably infinite collection of field variables and $b$ ranges over a given set
of base types. The definitions of fields and rows are just like the ones in H3| except
for the fact that a field is tagged with a variable instead of a label. We omit
superscripts in rows whenever we can complete them. We sometimes regard a row $\rho$
as a set of pairs of variables and monotypes, that is, $(x : \tau) \in \rho$ if and only if
$\rho = \rho'; x : \tau; \cdots$. In addition to the ordinary set of monotypes, $[\rho_1 \triangleright \tau_1 /\!/ \rho_2 \triangleright \tau_2]$
denotes a context type, and $[\sigma]$ denotes a polytype as mentioned before. We say
a context type $[\rho_1 \triangleright \tau_1 /\!/ \rho_2 \triangleright \tau_2]$ is legal if $(x : -) \in \rho_1$ implies $(x : \tau) \notin \rho_2$, and a
type scheme $\sigma$ is legal if $[\sigma']$ does not occur anywhere other than in $\rho; x : [\sigma']$ in $\sigma$
and if any context type in $\sigma$ is legal.

The constructs $\forall\alpha.\sigma$ and $\forall\chi.\sigma$ bind type variable $\alpha$ and row variable $\chi$ in
$\sigma$, respectively. The set of free type variables and free row variables of a type



First-Class Contexts in ML 






scheme $\sigma$ is denoted as $FTV(\sigma)$ and $FRV(\sigma)$, respectively. The definitions of
$FTV$ and $FRV$ are as usual. We write $FV(\sigma)$ for $FTV(\sigma) \cup FRV(\sigma)$. We identify
type schemes up to $\alpha$-conversion as usual.

We define some notations on types which will be needed when we define a type
system. Let $\bullet$ denote an unspecified part of types. We write $EI([\rho_1 \triangleright \bullet /\!/ \rho_2 \triangleright \bullet])$
for the extracted interface from a context type, which is the maximum set of pairs
of variables and monotypes satisfying the condition: if $(x : \tau) \in EI([\rho_1 \triangleright \bullet /\!/ \rho_2 \triangleright \bullet])$,
then $\rho_1$ contains $(x : \tau)$ and $\rho_2$ contains $(x : -)$. We define the set of prohibited
variables, denoted by $PV([\rho \triangleright \bullet /\!/ \rho' \triangleright \bullet])$, for the set of variables prohibited to be
exported by a context having that type, by $PV([\rho \triangleright \bullet /\!/ \rho' \triangleright \bullet]) = \{x \mid (x : -) \in \rho \cap \rho'\}$.
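
The two definitions above can be sketched in Python under simplifying assumptions: a row is modeled as a finite dictionary from variable names to fields, the absent field is the string `"-"`, types are plain strings, and row variables and field variables are ignored. The function names are illustrative only.

```python
ABSENT = "-"  # the field marking a variable as not present

def extracted_interface(rho1, rho2):
    """EI: pairs (x : tau) that are present in rho1 and absent in rho2."""
    return {x: field for x, field in rho1.items()
            if field != ABSENT and rho2.get(x) == ABSENT}

def prohibited_variables(rho1, rho2):
    """PV: variables whose field is absent in both rows."""
    return {x for x, field in rho1.items()
            if field == ABSENT and rho2.get(x) == ABSENT}

# Rows of the context type [chi; x : int |> alpha // chi; x : - |> alpha],
# with an extra variable y that is absent on both sides.
rho_in = {"x": "int", "y": ABSENT}
rho_out = {"x": ABSENT, "y": ABSENT}

interface = extracted_interface(rho_in, rho_out)
prohibited = prohibited_variables(rho_in, rho_out)
```

Here `interface` contains the binding for `x` exported through the hole, and `prohibited` contains `y`, which the context may not export.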

A type substitution is a function from a finite set of type variables to the
set of type schemes. We write $\{\sigma_1/\alpha_1, \ldots, \sigma_n/\alpha_n\}$ for the type substitution that
maps each $\alpha_i$ to $\sigma_i$ simultaneously. A type substitution $\varphi$ is extended to the set
of all type variables by defining $\varphi(\alpha) = \alpha$ for all $\alpha \notin dom(\varphi)$, and in turn it is
extended to the set of all type schemes, as usual. However, we maintain that
the domain of a type substitution always means the domain of the original finite
function.

A row substitution is a function from a finite set of row variables to the set of
rows. We use the notation $\{\rho_1^{V_1}/\chi_1^{V_1}, \ldots, \rho_n^{V_n}/\chi_n^{V_n}\}$ for the row substitution that
simultaneously maps each $\chi_i^{V_i}$ to $\rho_i^{V_i}$. A row substitution is extended to the set
of all row variables and in turn to the set of all type schemes in the same way as the type
substitution mentioned above, and its domain is also treated similarly.

A compound substitution, or simply substitution, is a function from the set of
type schemes to the set of type schemes. Let $\mu$ and $\varphi$ be a row substitution and
a type substitution, respectively. A substitution $S$, denoted by a sequence of
type substitutions and row substitutions, is defined as a composition of extended
type substitutions and row substitutions: if $S = S'\mu$ then $S(\sigma) = S'(\mu(\sigma))$, or
if $S = S'\varphi$ then $S(\sigma) = S'(\varphi(\sigma))$, where $S'$ may be an empty sequence.
The domain of $S$ is the set of variables defined as the union of all domains of
the components. We write $S \circ S'$ for the composition of $S$ and $S'$, defined as
$(S \circ S')(\sigma) = S(S'(\sigma))$. A ground substitution is a substitution $S$ such that for
any $\sigma \in cod(S)$, $\sigma$ contains no free variables. We exclude any substitutions which
collapse legal context types in the rest of the development.

The set of terms is given by the grammar:

$$\begin{array}{rcll}
\Gamma & ::= & \{x_1 : \tau_1, \ldots, x_n : \tau_n\} & \text{Interfaces} \\
e & ::= & x \mid c^b \mid \lambda x.e \mid e\ e \mid \mathtt{let}_{\sigma}\ x = e\ \mathtt{in}\ e \mid X^V \mid \delta X.e \mid e \odot_{\Gamma} e & \text{Terms}
\end{array}$$

$c^b$ is a constant which has a base type $b$. $X^V$ is a hole. $V$ on the shoulder of hole
$X$ denotes the set of variables which are not exported, i.e., those variables which
will be used locally in the context containing $X$. $\delta X.e$ denotes a context built
up by the hole abstraction. $e_1 \odot_{\Gamma} e_2$ is a context application. $\Gamma$ denotes the set
of variables and their (closed) types which is exported by $e_1$ and imported by
$e_2$ during the computation of the context application. Note that the variables in
$dom(\Gamma)$ are regarded as bound in the term $e_1 \odot_{\Gamma} e_2$.

Var %,T \> X \ T ifT (x) ^ r



Const 0, T l> c*” : 6 



{Xj : [pit>Ti//pi-X : ti>»]},T \±} {X ■. Ta} > e : n ^ . 

{Xi : [pi>Ti//pi',x : -t> »]},T [> Xx.e : Ta ^ n 



App 



Let 



Til, T > ei : Tl — > T2 Ti.2,T t> 62 ■ Tl 

Til W 7^2, T l> ei 62 : T2 

Ti, T l> 6i : a {Xj : [pi > Tj/p'; a; : t> »]},T tel {x : a} l> 62 ■ t 



H W {Xi : [pi > Ti//p[- X : ->•]}, T > let^o a; = ei in 62 : t 
if Li = — , or for some <ti X cr if there exists S such that S'(cro) = S{ai), a = [cri] 

Hole {X : [p; Xi : ->t//p; Xi : •]}, T > : t 



HAbs 



CApp 



Ti l±l {X : [pi t> Ti//p 2 t> »]},T > e : T2 
'H,T 6 X.e : [pi t> Ti /p2 > T2] 

Ti,Ti>ei : [pa> Tg// pt> n] {Xj ■. [pi n// pgi> •\},T \bT' \> 62 ■ Xg 
H W {Xi : [pi [> TiUpb 0 •]}, T > 6103^62 : Tb 

if El{[pa>»//Pb>*]) 2 [T\=I 



'hL '7 C> 6 ' T 

Gen ’ — ^ — if cr = Gen{H, T, t) 

ri, I [> e (J 



Fig. 1. Typing rules 



The type system is defined as a proof system deriving a typing of the form

$$\mathcal{H}, \Gamma \triangleright e : \sigma$$

which indicates that $e$ has type scheme $\sigma$ under a variable type assignment $\Gamma$
and a hole type assignment $\mathcal{H}$, which is a function assigning types $[\rho \triangleright \tau /\!/ \rho' \triangleright \bullet]$ to
labeled holes. We assume that any $\Gamma$ and $\mathcal{H}$ do not contain illegal type schemes
in the rest of the development. The set of typing rules is given in Figure 1.

The generalization Gen{Ti.,T,a) = V^.cr where ^ are all free variables of 
a that do not occur in H and T. We write cr 0 cr' when cr = V^i • • 
o' = such that each is not free in • • • C^.r' and t = S(r') for 

some S such that dom{S) = {^1, . . . Let T = {xi : Oi\. We write [T] for 

{xi : [Oi]}. 

Some explanations of the typing rules are in order. 

— Rule Abs. We require that all rows in the right side of the context type of 
any hole in the hole type assignment contain $x$. Then the rule changes the $x$-field
to $-$ and also discharges $\{x : \tau_a\}$ from the type assignment.

— Rule Let. The annotation $\sigma_0$ associated with let specifies the generality of
the type of variable $x$ which will be captured through the possible context
application. We can omit this burden in the following cases:

• $e_2$ has no hole,

• $\sigma$ is a monomorphic type, and

• $\sigma$ itself is the programmer’s choice for the type of $x$ exported through the
holes in $e_2$.

In other cases, this annotation is needed to keep principality of the type 
system of ML. 

— Rule Hole. $X^V$ keeps the row input unchanged when it is not surrounded by
any binding constructs. The variables $x_i$ are excluded from the hole type assignment by
assigning $-$ to them.

— Rule CApp. 62 expects that ei provides the bindings for the free variables in 
dom{T), and I also provides the type informations for the variables which 
will be captured by ei. Note that ei may export more bindings than 62 
requires. 

In our system, each free hole occurs linearly as in |B| . If multiple occurrences 
of a hole are allowed, we must maintain the information of the bindings exported 
through each occurrence of a hole. The linearity condition is ensured by the rules 
App, Let, Hole and CApp. 
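
The linearity condition can be approximated by a syntactic check, sketched below in Python. The sketch assumes that "linear" means that each hole bound by a hole abstraction occurs exactly once in its body and that no free hole occurs more than once; the tuple encoding of terms and all function names are illustrative and not part of the type system, where the condition is enforced by the typing rules themselves.

```python
from collections import Counter

# Terms: ("var", x) | ("const", c) | ("lam", x, e) | ("app", e1, e2)
#      | ("let", x, e1, e2) | ("hole", X) | ("habs", X, e)
#      | ("capp", e1, names, e2)

def children(term):
    """Immediate subterms of a term."""
    tag = term[0]
    if tag in ("lam", "habs"):
        return [term[2]]
    if tag == "app":
        return [term[1], term[2]]
    if tag == "let":
        return [term[2], term[3]]
    if tag == "capp":
        return [term[1], term[3]]
    return []

def free_holes(term):
    """Count the occurrences of each free hole in a term."""
    if term[0] == "hole":
        return Counter([term[1]])
    total = Counter()
    for child in children(term):
        total += free_holes(child)
    if term[0] == "habs":
        total.pop(term[1], None)  # the abstracted hole is bound here
    return total

def is_linear(term):
    """Check the assumed linearity condition on every subterm."""
    if term[0] == "habs" and free_holes(term[2])[term[1]] != 1:
        return False
    if any(count > 1 for count in free_holes(term).values()):
        return False
    return all(is_linear(child) for child in children(term))

ok = ("habs", "X", ("app", ("lam", "x", ("hole", "X")), ("const", 5)))
twice = ("habs", "X", ("app", ("hole", "X"), ("hole", "X")))
unused = ("habs", "X", ("const", 1))
```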

The following lemma shows that typing judgments are stable under substi- 
tution. 

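Lemma 1. If $\mathcal{H}, \Gamma \triangleright e : \sigma$, then for any substitution $S$, $S(\mathcal{H}), S(\Gamma) \triangleright e : S(\sigma)$.

Proof. By simple induction on the derivation of $\mathcal{H}, \Gamma \triangleright e : \sigma$. □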

Note that S is not applied to the term e. We must consider variables occurring 
in the type annotations in e as closed ones quantified by existential quantifiers. 

3.2 Dynamic Semantics 

We give a call-by-value operational semantics of the system in the style of natural
semantics by giving a set of rules to derive a reduction relation of the form

$$\eta, \zeta \vdash e \Downarrow v$$

indicating that $e$ evaluates to a value $v$ under a hole environment $\eta$, which is a
function from a finite set of holes to values, and a variable environment $\zeta$, which
is a function from a finite set of variables to values. The set of values is given
below.

$$v ::= c^b \mid func(\eta, \zeta, x, e) \mid cont(\eta, \zeta, X, e) \mid clos(\eta, \zeta, V, e) \mid wrong$$

$func(\eta, \zeta, x, e)$ and $cont(\eta, \zeta, X, e)$ indicate a function closure and a context closure
representing a function value and a context value, respectively. $clos(\eta, \zeta, V, e)$
indicates a term closure representing a value corresponding to an open term to
be filled in the suitable contexts. This value can be regarded as a generalization
of a function closure; if $V = \{x_1, \ldots, x_n\}$, it takes a set of values, one for each $x_i$,
and evaluates $e$ under the hole environment $\eta$ and the variable environment obtained
from $\zeta$ by extending it with the bindings for $\{x_1, \ldots, x_n\}$. $wrong$ represents
the runtime error.
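
$$\textsf{Var}\quad \eta, \zeta \vdash x \Downarrow v \quad \text{if } \zeta(x) = v \qquad\qquad \textsf{Const}\quad \eta, \zeta \vdash c^b \Downarrow c^b$$

$$\textsf{Func}\quad \eta, \zeta \vdash \lambda x.e \Downarrow func(\eta, \zeta, x, e)$$

$$\textsf{App}\quad \frac{\eta, \zeta \vdash e_1 \Downarrow func(\eta', \zeta', x, e') \quad \eta, \zeta \vdash e_2 \Downarrow v' \quad \eta', \zeta'\{x \mapsto v'\} \vdash e' \Downarrow v}{\eta, \zeta \vdash e_1\ e_2 \Downarrow v}$$

$$\textsf{Let}\quad \frac{\eta, \zeta \vdash e_1 \Downarrow v' \quad \eta, \zeta\{x \mapsto v'\} \vdash e_2 \Downarrow v}{\eta, \zeta \vdash \mathtt{let}_{\sigma}\ x = e_1\ \mathtt{in}\ e_2 \Downarrow v}$$

$$\textsf{Hole}\quad \frac{\eta', \zeta'\{x_i \mapsto \zeta(x_i)\} \vdash e \Downarrow v}{\eta, \zeta \vdash X^V \Downarrow v} \quad \text{if } \eta(X) = clos(\eta', \zeta', \{x_i\}, e)$$

$$\textsf{Cont}\quad \eta, \zeta \vdash \delta X.e \Downarrow cont(\eta, \zeta, X, e)$$

$$\textsf{CApp}\quad \frac{\eta, \zeta \vdash e_1 \Downarrow cont(\eta', \zeta', X, e') \quad \eta'\{X \mapsto clos(\eta, \zeta, dom(\Gamma), e_2)\}, \zeta' \vdash e' \Downarrow v}{\eta, \zeta \vdash e_1 \odot_{\Gamma} e_2 \Downarrow v}$$

Fig. 2. Evaluation rules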



The set of rules for the call-by-value operational semantics is given in Figure 2.
This set of rules should be taken with the following implicit rules yielding
wrong: if the evaluation of any component specified in a rule yields
wrong or does not satisfy the side condition of the rule, then the entire term
yields wrong. This operational semantics is readily implementable.
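To illustrate how directly the rules can be implemented, the following is a minimal Python sketch of an evaluator. The term and value encodings (`Const`, `Var`, `Lam`, `App`, `Let`, `Hole`, `Delta`, `Fill`, `FuncV`, `ContV`, `ClosV`) are hypothetical names chosen for this sketch, type annotations on terms are omitted, and the Hole case assumes that a hole occurrence captures each closure variable from the current variable environment under the same name.

```python
from dataclasses import dataclass

# Hypothetical term encoding for this sketch (not taken from the paper).
@dataclass
class Const:
    value: object
@dataclass
class Var:
    name: str
@dataclass
class Lam:
    param: str
    body: object
@dataclass
class App:
    fn: object
    arg: object
@dataclass
class Let:
    name: str
    bound: object
    body: object
@dataclass
class Hole:
    name: str
@dataclass
class Delta:            # context abstraction: delta X. e
    hole: str
    body: object
@dataclass
class Fill:             # context application; binders plays the role of dom(Gamma')
    ctx: object
    binders: tuple
    term: object

# Values: function, context and term closures, plus a distinguished error value.
@dataclass
class FuncV:
    henv: dict
    venv: dict
    param: str
    body: object
@dataclass
class ContV:
    henv: dict
    venv: dict
    hole: str
    body: object
@dataclass
class ClosV:
    henv: dict
    venv: dict
    binders: tuple
    body: object

WRONG = object()

def evaluate(henv, venv, e):
    """Evaluate e under hole environment henv and variable environment venv."""
    if isinstance(e, Const):
        return e.value
    if isinstance(e, Var):
        return venv.get(e.name, WRONG)
    if isinstance(e, Lam):
        return FuncV(henv, venv, e.param, e.body)
    if isinstance(e, App):
        f = evaluate(henv, venv, e.fn)
        a = evaluate(henv, venv, e.arg)
        if not isinstance(f, FuncV) or a is WRONG:
            return WRONG
        return evaluate(f.henv, {**f.venv, f.param: a}, f.body)
    if isinstance(e, Let):
        v = evaluate(henv, venv, e.bound)
        if v is WRONG:
            return WRONG
        return evaluate(henv, {**venv, e.name: v}, e.body)
    if isinstance(e, Hole):
        c = henv.get(e.name)
        if not isinstance(c, ClosV) or any(x not in venv for x in c.binders):
            return WRONG
        captured = {x: venv[x] for x in c.binders}   # bindings exported to the hole
        return evaluate(c.henv, {**c.venv, **captured}, c.body)
    if isinstance(e, Delta):
        return ContV(henv, venv, e.hole, e.body)
    if isinstance(e, Fill):
        k = evaluate(henv, venv, e.ctx)
        if not isinstance(k, ContV):
            return WRONG
        clos = ClosV(henv, venv, e.binders, e.term)  # the filled term stays unevaluated
        return evaluate({**k.henv, k.hole: clos}, k.venv, k.body)
    return WRONG

# Filling the hole of (delta X. lambda x. X) with the open term x lets the
# lambda in the context capture x; applying the result to 3 then yields 3.
ctx = Delta("X", Lam("x", Hole("X")))
prog = App(Fill(ctx, ("x",), Var("x")), Const(3))
print(evaluate({}, {}, prog))  # prints 3
```

The sketch returns `WRONG` whenever a premise fails, which mirrors the implicit rules described above; it makes no attempt to check linearity of holes, which the type system is responsible for.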



3.3 Type Soundness 

We extend the set of types with auxiliary types of the form $[\Gamma \triangleright \tau]$ for term
closures. A value $v$ has a closed type scheme $\sigma$, denoted by $\models v : \sigma$, if it is
derivable from the rules given in Figure 3.

Let $\mathcal{H}, \Gamma \triangleright e : \sigma$ be a derivable typing judgment. A hole environment $\eta$ and
a variable environment $\zeta$ satisfy the typing judgment $\mathcal{H}, \Gamma \triangleright e : \sigma$, denoted by
$(\eta, \zeta) \models (\mathcal{H}, \Gamma \triangleright e : \sigma)$, when the following conditions hold.

— For all $x \in dom(\Gamma)$, $\models \zeta(x) : \Gamma(x)$.

— If the last rule which is used in the derivation of T > e : cr is CApp, 

Ho W {Aj : [pi [> nil pb [> •]},'T > 6102-62 : n is derived from > ei : 

[pa > Tallpb > n] and {Xi : [pi > nil Pa > •]}, T l±l T' > 62 : r^. Then, 

• for all X G dom{Ho), if Ho{X) = [•or/* o*] then ^ p{X) : [I' o r], 
where I' C EI{Ho{X)) l±l [T\py(^g(x))]. 

• And for all Xi,\= g{Xi) : [X^ o Ti], where X* C [T'\pv([pi^Ti // W 

EI{\pi O Till Pa 1> •]) W [T\py([p.>Ti/'p(,i>»])]- 

Otherwise, for all X G dom{H), ifH{X) = [•or/*o*] then \= 77 (A) : [Xor], 
where X C EI{H{X)) 1 +) [T\pv(n{x))]- 
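$\models c^b : b$

$\models v : [\sigma] \iff \models v : \sigma$

$\models \mathit{func}(\eta, \zeta, x, e) : \tau_1 \to \tau_2 \iff$ for all $v$ such that $\models v : \tau_1$,
    if $\eta, \zeta\{x \mapsto v\} \vdash e \Downarrow v'$ then $\models v' : \tau_2$

$\models \mathit{clos}(\eta, \zeta, \{x_i\}, e) : [\{x_i : \tau_i\} \triangleright \tau] \iff$ for all $v_i$ such that $\models v_i : \tau_i$,
    if $\eta, \zeta\{x_i \mapsto v_i\} \vdash e \Downarrow v'$ then $\models v' : \tau$

$\models \mathit{cont}(\eta, \zeta, X, e) : [\rho_1 \triangleright \tau_1 \,/\!/\, \rho_2 \triangleright \tau_2] \iff$ for all $\Gamma$ such that $\Gamma \subseteq EI([\rho_1 \triangleright \bullet \,/\!/\, \rho_2 \triangleright \bullet])$,
    for all $v$ such that $\models v : [\Gamma \triangleright \tau_1]$,
    if $\eta\{X \mapsto v\}, \zeta \vdash e \Downarrow v'$ then $\models v' : \tau_2$

$\models v : \forall\xi_1 \cdots \xi_n.\tau \iff$ for any ground substitution $S$ such that
    $dom(S) = \{\xi_1, \ldots, \xi_n\}$, $\models v : S(\tau)$.

Fig. 3. Definition of value typing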



With these preparations, we prove the following soundness theorem, which
states that a well-typed program will not produce a runtime error. The proof
is deferred to the appendix.

Theorem 1 (Type Soundness). If $\mathcal{H}, \Gamma \triangleright e : \sigma$ is derivable, then for any
ground substitution $S$ such that $dom(S)$ contains all the free variables in the
derivation of $\mathcal{H}, \Gamma \triangleright e : \sigma$, and for any $\eta, \zeta$ such that $(\eta, \zeta) \models (S(\mathcal{H}, \Gamma \triangleright e : \sigma))$,
if $\eta, \zeta \vdash e \Downarrow v$ then $\models v : S(\sigma)$.

4 Type Inference 

In this section, we develop a type inference algorithm. 

In the algebraic point of view, the definition of context types can be regarded 
as a combination of the record type appeared in H3|, and locked polytype ap- 
peared in |S|. First, we define a first-order unification for our type system using 
these techniques. 

Solving unification problems is formulated as the transformation of unification
problems defined as follows.

$U ::= \bot \mid \top \mid U \wedge U \mid \exists\alpha.U \mid E \mid \sigma = \sigma$    Unification problems
$E ::= E_\tau \mid E_\rho$    Multi-equations
$E_\tau ::= \tau \mid \tau = E_\tau$    Multi-equations for types
$E_\rho ::= \rho \mid \rho = E_\rho$    Multi-equations for rows

The symbol $\bot$ denotes a unification problem that has no solution, and $\top$ denotes
a trivial unification problem. We treat $\top$ as a unit and $\bot$ as a zero for $\wedge$. That
is, $U \wedge \top$ and $U \wedge \bot$ are equal to $U$ and $\bot$, respectively. We also identify $\top$
with singleton multi-equations. That is, we can always assume that $U$ contains
at least one multi-equation $\alpha = E$ for each type variable $\alpha$ of $U$ and at least
one multi-equation $\chi = E$ for each row variable $\chi$ of $U$. A complex formula is
the conjunction of other formulas or the existential quantification of another
formula. The symbol $\wedge$ is commutative and associative. The symbol $\exists$ acts as
a binder. The symbol $=$ is commutative and associative.

A substitution $S$ is a solution of a multi-equation $E$ for monotypes if the
results of applying it to all terms of $E$ coincide, and $S$ is a solution of $\sigma = \sigma'$
if $S(\sigma) = S(\sigma')$, where $=$ is modulo reordering and renaming of bound variables,
and removal of redundant universally quantified variables. We can say
$\forall\bar{\xi}.\tau = \forall\bar{\xi}'.\tau'$ if and only if there exists a substitution $S$ such that

1. $S(\tau) = S(\tau')$

2. $S|_{\bar{\xi}}$ and $S|_{\bar{\xi}'}$ are injective in $\bar{\xi} \cup \bar{\xi}'$

3. no variable of $\bar{\xi} \cup \bar{\xi}'$ appears in $S \setminus (\bar{\xi} \cup \bar{\xi}')$

We introduce another kind of unificands $\bar{\xi} \leftrightarrow \bar{\xi}'$ whose solutions are substitutions
satisfying the above conditions 2 and 3. We assume $\bar{\xi} \cap \bar{\xi}' = \emptyset$ in order to
avoid unnecessary complexity. The symbols $=$ in polytype equations and $\leftrightarrow$ are
commutative. Two unification problems are equivalent if they have the same set
of solutions.

Given a unification problem $U$, we define the constraint ordering $\prec_U$ as
the transitive closure of the immediate precedence ordering containing all pairs
$\alpha \prec \alpha'$ such that there exists a multi-equation $\alpha = \tau = E$ in $U$ where $\tau$ is a
non-variable term that contains $\alpha'$. A unification problem is strict if $\prec_U$ is strict.

A unification problem is in solved form if it is either $\bot$ or $\top$, or if it is strict
and of the form $\exists\bar{\xi}.\bigwedge_{i \in 1..n} E_i$ where each $E_i$ contains at most one non-variable term,
and if $i \neq j$ then $E_i$ and $E_j$ contain no variable term in common.

We write $U \Longrightarrow \exists\bar{\xi}.S$ if $S$ is a principal solution of $U$ and the variables $\bar{\xi}$ are not
free in $U$, and $U \Longrightarrow \bot$ if $U$ is unsatisfiable. We write $|\sigma|$ for the size of $\sigma$: the
number of occurrences of the symbols $\_ \to \_$, $x : \_$, or $[\_ \triangleright \_ \,/\!/\, \_ \triangleright \_]$ in $\sigma$. We use
the notation $|\_|$ also for rows by abuse of notation.

The unification algorithm is given as a set of rewriting rules that preserve
equivalence in Figure 4. There are implicit compatibility rules that allow rewriting
complex formulas by rewriting any sub-formula, and crash rules that yield $\bot$
when the top symbols on both sides of $=$ are not the same symbol.

The rules Occur,Merge, Absorb and Decompose-Fun are for the ordinary first- 
order unification. The other part of the system includes the rules for polytypes 
Decompose-Poly, Polytypes, Renaming-True and Renaming-False which are origi- 
nated from the first-order unification for polytypes in 0, and also includes the 
rules for rows Decompose-Row,Mutate-Absent and Mutate-Field which are from 
The rule Decompose-Cont naturally decomposes context types. 

Theorem 2. Given a unification problem $U$, the set of rules in Figure 4 computes
a most general unifier $S$, or fails by computing $\bot$.

Proof (Sketch). By showing that the right and the left side of each rule have exactly
the same set of solutions. Termination of the algorithm can be shown by
induction on an appropriate weight. □
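The ordinary first-order part of the algorithm (the roles played by Occur, Merge, Absorb, Decompose-Fun and the crash rules) can be illustrated with a small Python sketch. This is a conventional substitution-based unifier, not the multi-equation rewriting system of the figure, and it covers neither polytypes, rows, nor context types. The representation is an assumption of the sketch: type variables are strings and constructed types are tuples such as `("->", t1, t2)` or `("int",)`.

```python
def resolve(t, subst):
    """Follow variable bindings until an unbound variable or a constructed type."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def occurs(var, t, subst):
    """Occurs check: does var appear in t under subst?"""
    t = resolve(t, subst)
    if isinstance(t, str):
        return t == var
    return any(occurs(var, s, subst) for s in t[1:])

def unify(t1, t2):
    """Return a most general unifier as a dict, or None if unification fails."""
    subst = {}
    stack = [(t1, t2)]
    while stack:
        a, b = stack.pop()
        a, b = resolve(a, subst), resolve(b, subst)
        if a == b:
            continue                      # trivial equation, nothing to do
        if isinstance(a, str):
            if occurs(a, b, subst):
                return None               # cyclic constraint (cf. Occur)
            subst[a] = b                  # bind the variable (cf. Merge)
        elif isinstance(b, str):
            stack.append((b, a))          # equations are symmetric
        elif a[0] != b[0] or len(a) != len(b):
            return None                   # top symbols differ (crash rule)
        else:
            stack.extend(zip(a[1:], b[1:]))  # decompose (cf. Decompose-Fun)
    return subst

def apply(t, subst):
    """Apply a substitution to a type."""
    t = resolve(t, subst)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(apply(s, subst) for s in t[1:])

s = unify(("->", "a", "b"), ("->", ("int",), "a"))
print(apply(("->", "a", "b"), s))  # prints ('->', ('int',), ('int',))
```

Failure is reported as `None`, which corresponds to rewriting the problem to the unsatisfiable formula.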
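Occur
    if $\prec_U$ is not strict then $U \Longrightarrow \bot$

Merge
    $\xi = E \wedge \xi = E' \Longrightarrow \xi = E = E'$

Absorb
    $\xi = \xi = E \Longrightarrow \xi = E$

Decompose-Fun
    if $|\tau_1 \to \tau_2| \le |\tau_1' \to \tau_2'|$ then
    $\tau_1 \to \tau_2 = \tau_1' \to \tau_2' = E \Longrightarrow \tau_1 \to \tau_2 = E \wedge \tau_1 = \tau_1' \wedge \tau_2 = \tau_2'$

Decompose-Poly
    if $|\sigma| \le |\sigma'|$ then $[\sigma] = [\sigma'] = E \Longrightarrow [\sigma] = E \wedge \sigma = \sigma'$

Decompose-Cont
    if $|[\rho_1 \triangleright \tau_1 \,/\!/\, \rho_2 \triangleright \tau_2]| \le |[\rho_1' \triangleright \tau_1' \,/\!/\, \rho_2' \triangleright \tau_2']|$ then
    $[\rho_1 \triangleright \tau_1 \,/\!/\, \rho_2 \triangleright \tau_2] = [\rho_1' \triangleright \tau_1' \,/\!/\, \rho_2' \triangleright \tau_2'] = E \Longrightarrow [\rho_1 \triangleright \tau_1 \,/\!/\, \rho_2 \triangleright \tau_2] = E \wedge \tau_1 = \tau_1' \wedge \tau_2 = \tau_2' \wedge \rho_1 = \rho_1' \wedge \rho_2 = \rho_2'$

Decompose-Row
    if $|\rho_1; x : \iota_1| \le |\rho_2; x : \iota_2|$ then
    $\rho_1; x : \iota_1 = \rho_2; x : \iota_2 = E \Longrightarrow \rho_1; x : \iota_1 = E \wedge \rho_1 = \rho_2 \wedge \iota_1 = \iota_2$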

Polytypes 

let ^ n f ' = 0 and ^ n EV (r') = 0 and n -FIF (t) = 0 in 
VC.t = VC'.t' ^ = a 

Renaming-True 

let 5 and f ^ ^,)iei...n ^ ^ ^ y 

Renaming-False 

if /3 e ^ and r ^ f ' U {/9} then j3 = t = E A ^ _L 

if /3 € ^ n F’V' (r) and r ^ j3 then j3' = t = E A i ^ ^ -L 

Mutate- Absent 

p\x : i = - = E ^ —= E A t = -A p = - 

Mutate-Field 

if [pi;* : ti[ < [p2;* : t2[ then 

pr,x ■. Li = p2\y ■■ L2 = E ^ 3 x-Pi; ® : '-i = A pi = x;p:t2 A p2 = X',x:i.i 



Fig. 4. Rules for first-order unification 
A typing problem is formulated as a triple $\mathcal{H}, \Gamma \triangleright e : \tau$. A solution of a typing
problem is a substitution $S$ such that $S(\mathcal{H}), S(\Gamma) \triangleright e : S(\tau)$. The set of solutions
of a typing problem is stable under substitutions by Lemma 1. Therefore we
can treat typing problems as unification problems. Uniform treatment of type
inference and unification enables us to prove the theorems in a simpler way, as
described in M-

We use the operator $(\_)$ to unlock monomorphic polytypes. Its effect is defined
by $([\sigma]) = \sigma$ and $(\tau) = \tau$ if $\tau$ is not of the form $[\sigma]$. Let $\Gamma$ be $\{x_1 : \tau_1, \ldots, x_n : \tau_n\}$. We write
$(\rho; \Gamma)$ for $(\rho; x_1 : \tau_1; \ldots; x_n : \tau_n)$, $(\rho; \Gamma^{-})$ for $(\rho; x_1 : -; \ldots; x_n : -)$ and $(\Gamma)$ for
$\{x_1 : (\tau_1), \ldots, x_n : (\tau_n)\}$, respectively. The rules for solving typing problems are
given in Figure 5. The following theorem shows that the type inference algorithm
is sound and complete. We can prove it in a way similar to Theorem 2, due to
the uniform treatment of unification and type inference.

Theorem 3. Given a typing problem $\mathcal{H}, \Gamma \triangleright e : \tau$, the set of rules in Figures 4
and 5 computes a principal solution $S$, or fails by computing $\bot$.



5 Conclusion 

In 0, Hashimoto and Ohori have extended the simply typed lambda calculus 
with the feature of first-class contexts. In such a calculus, binding effects origi- 
nally introduced by function abstraction can be propagated through the holes. 
In this previous work, the type system can precisely describe such effects in the 
simply typed setting. 

In the present paper, we have developed the basis for developing a practi- 
cal programming language enjoying the benefit of first-class contexts by giving 
an ML-style polymorphic type deduction system, an operational semantics and 
soundness of the type system and a sound and complete type inference algorithm. 
Using this language we can represent various useful features such as simple type- 
safe macros, linking modules, programs for distribution, and so on. There are 
several interesting issues that merit further investigation. We briefly mention 
some of them. 

Efficient implementation. Based on this work, we have implemented an experimental
interpreter system on a Standard ML system. However, there is plenty
of room for improvement, including the representation of closures and the type
checking algorithm. We believe that a number of existing compilation techniques
for records can be utilized for this purpose.

Extension to a second-order system. In the present system, contexts are only able
to export bindings for term variables. As seen in module systems,
however, definitions of types are also exported in realistic applications.
If the system is extended to a second-order system that can provide bindings
for types, the applicability of contexts in such areas will be
increased.
let V^.t' = T(a;) and ^ n F’V^(r) = 0 in 7t, Tt>® : r ^ 3^.(r = r') 



let dom{H) = {Xi} and a, 13, 7 i, Xi, t FV{H) U FV{T) U FV{r) in 

{ {Xi : [xi > 7i/f7Ti > •]}, T W {a; : a}t>e : (3 

l~=t7x^.a 

H = {Xi: : ->•]} 



let a ^ FV{H) U FV{T) U FV{t) and H = Hi W H 2 in 
H,Tt>eie 2 : t ^ 3a.(Hi,Tt>ei : a ^ t) A {H 2 ,T[>e 2 : a) 



let n = Hi W H 2 and dom{H 2 ) = {X,} and /3i, Xi, tti, i/> ^ FV{H 2 ) U FV{T) 

and = FV{to) and ^1 be a copy of ^1 outside of H and T and n = {^i/^o}to 

and (TO = V6-ro and ^ C (Hy(H) U Hy(T)) = 0 in 

ifHi,Tt>ei : n ^ 3^^'.5' then let (t = Gen(5'(n), S(H), ^(T)) in 

if ^ n {dom{S) U FV (cod{S))) = 0 then 

H, T t>leto-o ® = ei in 62 : t 

1® 

''|7Ti=i/);®: IVC-n] 

[ H 2 = {Xi : [xit> I3i//ip-,x : ->•]} 
else H,Tt>leto-o a; = ei in 62 : r ^ ± 



let H = {X : [pi > t' H p 2 > •]} and x ^ ([pi > F jj p 2 > •]) in 

H, T : r ^ 3x-r = t A pi X p2 = x\Xi : - 



let a, f3,x,T^^ FV{H) U FV{T) U FV{t) in 

H, T t>(5X.e : T ^ ^aPx'x.'H W {X : [x > a/ir > •]}, Tt>e : (3 A r = [x > a/ir > /3] 

CApp 

let H = Hi'S H 2 and dom{H 2 ) = {Xi} 
and a, f3i, x, tt, Xi, aii 0 HF (H) U FV{T)S FV{r) in 
ifHi,Tt>ei : [x',F t> a/fn-,! >r] ^ 3^.5' then 
let I' = EI{[S{x-,T)'>»//S{tt-I-)\>»]) in 

OV -T-rA ^ o A J ^ Fll'Xi > •]}, ^ w {I')>e 2 : a 

H,Tl>ei©2-e2 : r ^ 3^a/liX7rXi7ri. /\ < ^ ^ 

[ H 2 = {Xi : [xi >/3i/7r;T“l>»]} 



Fig. 5. Type inference algorithm 
Relationship with modal operators. In our system, first-class contexts can provide
a simple macro system in a type-safe way. We can provide macro arguments by
hole filling. However, simple hole filling causes generation of inefficient code
in general. For the purpose of generating efficient code, we need a notion of 
computation stages. Integration of our system and the feature of explicit code 
generation using modal operators seen in ML with modality uni will produce 
a more interesting system. 

Relationship with other systems. There are several formal systems for manipu- 
lating names and bindings: Ait-Kaci and Garrigue’s label-selective A-calculus | 2 |, 
Dami’s AA ^ and Acr-calculus of Abadi et.al. PJ. Although none of them can 
directly represent the notion of first-class contexts, similar features appear to
be representable in those systems. The precise relationship between the context
calculus and those calculi would be an interesting topic for further investigation.
A Proof of the Main Theorem

Theorem 1 (Type Soundness). If $\mathcal{H}, \Gamma \triangleright e : \sigma$ is derivable, then for any
ground substitution $S$ such that $dom(S)$ contains all the free variables in the
derivation of $\mathcal{H}, \Gamma \triangleright e : \sigma$, and for any $\eta, \zeta$ such that $(\eta, \zeta) \models (S(\mathcal{H}, \Gamma \triangleright e : \sigma))$,
if $\eta, \zeta \vdash e \Downarrow v$ then $\models v : S(\sigma)$.

Proof. Let $S$ be a ground substitution which satisfies the condition in the theorem.
We proceed by induction on the derivation of $\mathcal{H}, \Gamma \triangleright e : \sigma$. The case of
Const is immediate.

Case Var 0,T > x : r (T(x) ^ r) : 

Suppose {r], () \= (S(0, T \> x : r)) and r],<4 \~ x v. Let T{x) = V^i • • • fn-To- 
Then \= v : S{T{x)) = V^i • • • ^„.S(to) by the definition of satisfying and 
the bound variable convention. Since T(x) ^ r, there is some Sq such that 
dom{So) = {^ 1 , . . . ,Cn} and So(ro) = r. Then S(r) = S(So(to)) = (SoSo)(S(to)) 
by the bound variable convention. Since dom{S) D FV{t), S o Sq is ground. 
Therefore \= v : {S o Sq){S{to)) by the definition of value typing. 
{X, : [pi t> TiHp[-, x: ijt> «]},r i±l {a: : Tg} > e : Tb 
{Xi : [pit>n// p\]x : ->•]}, T > \x.e : Ta ^ n 
(ii = Ta or ii = -) : 



Suppose (77, C) h (S'({Xi : [pi t>Tijjp'f,x : ->•]}, T > Xx.e : Ta T&)). Then for 
all X € dom{T), [= C(a;) : S{T{x)) and for all Xi, |= r]{Xi) : [Ji t> S'(ri)] where 
li C EI{S{[pit>u//p'p,x : -o*]))l±l [S'(T) |dom(T)\py(S([pi>.//p';a;:-t>.]))]- Since for all 
Xi, the difference between EI{S{[pit> n// p'p, x : iii>»])) and E I {S {[pi > Ti // p'p, x : 
->•])) is at most {x : S'(Ta)}, for any v ' such that |= v ' : S{ra), we can derive 
{r],C{x 1-^ v }) \= {S{{Xi : {p^\> p\\x : t* c> •]}, T l±) {a; : Ta} > e : Th)) by the 
definition of satisfying. Hence, if rjX{x 1-^ b e }1 u', then \= v ' : S{Tb) by 
the induction hypothesis. By the definition of value typing, ^ func^rj, C., x, e) : 
S'(Ta) ^ S{Tb). 



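Case App $\dfrac{\mathcal{H}_1, \Gamma \triangleright e_1 : \tau_1 \to \tau_2 \quad \mathcal{H}_2, \Gamma \triangleright e_2 : \tau_1}{\mathcal{H}_1 \uplus \mathcal{H}_2, \Gamma \triangleright e_1\ e_2 : \tau_2}$ :

Suppose $(\eta, \zeta) \models (S(\mathcal{H}_1 \uplus \mathcal{H}_2, \Gamma \triangleright e_1\ e_2 : \tau_2))$. By the definition of satisfying,
$(\eta, \zeta) \models (S(\mathcal{H}_1, \Gamma \triangleright e_1 : \tau_1 \to \tau_2))$ and $(\eta, \zeta) \models (S(\mathcal{H}_2, \Gamma \triangleright e_2 : \tau_1))$. If
$\eta, \zeta \vdash e_1\ e_2 \Downarrow v$ then for some $\eta', \zeta', e'$ and $v'$, we have $\eta, \zeta \vdash e_1 \Downarrow \mathit{func}(\eta', \zeta', x, e')$,
$\eta, \zeta \vdash e_2 \Downarrow v'$ and $\eta', \zeta'\{x \mapsto v'\} \vdash e' \Downarrow v$ by the evaluation rule App. Then by
the induction hypothesis, $\models \mathit{func}(\eta', \zeta', x, e') : S(\tau_1 \to \tau_2)$ and $\models v' : S(\tau_1)$. The
rest of this case is by the definition of value typing.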

H,T > 61 : g {X^ : {p^\> p\\x : 7, [>•]}, T i+l (a: : crj > 62 : r 
H l±) {Xi : [pi t>TiH p [\ T : - o •]},T > let^o x = 61 in 62 : r 
{ii = -, or for some cti ^ cr if there exists Sq such that S'o(cro) = S'o(o’i), 
ti = [cti]): 

Suppose (t 7,C) h (S'(^bl{A:i : : ->*]},T>letao a: = 61 in 62 : r)). 

Obviously, (77, C) \= {S{H,T > 61 : a)) by the definition. If 77, b letao x = 

61 in 62 -IJ- V , then 77, C b ei -Ij- v ' and 77, C{a: v '} b 62 -Ij- u by the evaluation 

rule Let. By the induction hypothesis applied to 77, ^ b 61 -Ij- v ' and (77, Q) |= 
(S'( 7 i, T > 6i : a)), we have \= v ' : S{a). Since for all Xi, the difference between 
EI{S{[pi>Ti//p'p,x : ti>*])) and EI{S{[p^>Ti// p'f, x : -!>•])) is at most {x : S'([(Ti])}. 
By the definition of value typing, ^ vq : S{a) if and only if |= uq : since 

S is ground. Therefore, for all Xi, if |= rj{Xi) : [{..., a; : S'([cti]), . . .} > n] then 
1 = r]{Xi) : [{..., a; : S'([(t]), . . .} > r,]. Then, we can derive (77, C{a^ |= 

(S'({Ali : [pii>ri///9';a; : ii[>*]},Tl±l{a; : (j}> 62 : r)) by the definition of satisfying 
similarly to the case of Abs. Then ^ u : S'(r) by the induction hypothesis. 

Case Hole {X : [p',Xi ■. - \> t jj p-,Xi : ->•]}, T > X^^'^ : r : 

Suppose (77, C) h (S'({A1 : [p;x^ : - >T(jp-,x^ : -i>*]},T \> X^^'^ : r)). Then 
for any x € dom{T), ^ ({x) : S{T{x)), and |= r]{X) : [ 1 1> S'(r)] for some 

2i C [T \do7n(r)\{xi}]- Let {yi} = dom{I). If 77, C b Jj. v, then for some 

77', C' and 6, we have r]{X) = closer]' ,C , {yi},e) and 77', C(yi)l b e -Ij- u 
by the evaluation rule Hole. Since {77^} C dom{T), we have |= C{yi) ■ S{l{yi)). 
Then by the definition of value typing, \= v : S{t). 
Case 



HAbs 



H i+l {X : [pi > Till p2 > •]}, T > e : T2 
Ti,T t> 6 X.e : [pi > Ti//p2 t> T2] 



Suppose (? 7 ,C) h \> 5 X.e : [pi t> ri/p2 t> T2])) and 77, C 1 “ 5 X.e -IJ. 

cont{riX, X,e). Let J = EI{S{[pi > •Hpi > •])). For all v and X' such that 
I' C X and ^ u : \X' t> S'(ri)], we have {r]{X ^ u},C) \= {S{Xl l±l {X : 
[pi [> Ti /p2 [>•]}, IF >e : T2)) by the definition of satisfying. If rj{X 1-^ v} \~ e U- v' , 
we have |= v' : S{t2) by the induction hypothesis. Then ^ cont{ri,C,,X,e) : 
[pi > S{ti)//p2 t> <S'(r2)] by the definition of value typing. 

H,T\>ei : [pa>Ta//pb>n] 

Case CApp {Xj : [pi o Till Pa t> »]},T i+l T' > 62 : Tg : 

H 1+) {Xi : [pi >nllpb > »]},T > 6102-62 : n 
{EI{[pa>.//pb>*])X[T]=X) 



Suppose (77,0 h {S{nCi{Xi : [pi n // pb •]} ,X >61 ©262 : n)). If 77, C F 
6102 62 'll V, then for some 77', and e', we have 77, ^ h 61 II cont{q' , X,e') 
and 77 '{AT clos{q,C,,dom{X),e2)},C' F e' II u by the evaluation rule CApp. 
Since (77, C) \= {S{H,T > Ci : [pa> Ta// Pb> Tb])) by the definition of satisfying, 
1 = cont{p' , X, e') : S{[pa >Tajlpb> Th]) by the induction hypothesis. Therefore 
for all X' such that X' C EI{[pa t> Tall Pb '> fb]), for all v' such that \= v' : \X' c> Ta], 
if ri'{X ^ F 6 ' 'll v”, then \= v" : Tb by the definition of value typing. 

Let Vi be values such that ^ Vi : S{T'{xi)) where dom{X) = {xi\. Since for all 
Xi, EI{S{[pi\>TiH Pb\>»])) c EI{S{[pi\>TiH Pa>»]))^ s([T]) by the definition of 
satisfying, we have (77, C{a:i 1-^ Ui}) |= {S{{Xi : iFl+lT' >62 : Ta)). 

Therefore by the induction hypothesis, if 77, Vi\ F 62 II v'" , then |= v'" : 

S{to)- Then |= clos{rj, C? dom{X), 62) : [S'([T']) c> S'(Ta)] by the definition of value 
typing. By the definition of value typing, if rj'{X clos{r]Xjd,om{X), 62)}, C F 
e' 'll V, then |= v : S{Tb). 

Case Gen ^ ^ ^ ^ (a = Gen(H,X ,t)) : 

rt, I \> e : O' 



Suppose (77, C) 1 = {S{H,T > e : cr)). Then there are some ^i,...^n such that 
cr = V^i • • -^n-T. By the bound variable convention, we can assume that none 
of {^1, . . .^„} appears in S. Then S'(cr) = V^i • • •^„.S'(t). Let S' be any ground 
substitution such that dom{S') = {Ci, • ■ • ■Cn}- S" o S' is ground. If 77, h e H v, 
then \= V : S'(S(r)) by the induction hypothesis. □ 
Abstract. A formal method is a mathematically-based technique used 
to describe properties of hardware and software systems. It provides a 
framework within which large, complex systems may be specified, de- 
signed, analyzed, and verified in a systematic rather than ad hoc manner. 
A method is formal if it has a sound mathematical basis, typically given 
by a formal specification language. 

In my talk I will review the seeds and early development of formal meth- 
ods. Citing notable case studies, I will survey the state of the art in 
the areas in which researchers and engineers most recently have made 
the greatest strides: software specification, model checking, and theorem 
proving. To close, I will suggest future research directions that should 
result in the most promising payoffs for practitioners. 
Abstract. In recent years, several semantics for place/transition Petri 
nets have been proposed that adopt the collective token philosophy. 
We investigate distinctions and similarities between three such mod- 
els, namely configuration structures, concurrent transition systems, and 
(strictly) symmetric (strict) monoidal categories. We use the notion of 
adjunction to express each connection. We also present a purely logi- 
cal description of the collective token interpretation of net behaviours in 
terms of theories and theory morphisms in partial membership equational 
logic. 



Introduction 

Petri nets, introduced by Petri in fZj (see also m), are one of the most widely
used and representative models for concurrency, because of the simple formal
description of the net model, and of its natural characterisation of concurrent
and distributed systems. The extensive use of Petri nets has given rise to different 
schools of thought concerning the semantical interpretation of nets, with each 
view justified either by the theoretical characterisation of different properties of 
the modelled systems, or by the architecture of possible implementations. 

A real dichotomy runs on the distinction between collective and individual 
token philosophies noticed, e.g., in (Sj. According to the collective token phi- 
losophy, net semantics should not distinguish among different instances of the 
idealised resources (the so-called ‘tokens’) that rule the basics of net behaviour. 
The rationale for this being, of course, that any such instance is operationally 
equivalent to all the others. As obvious as this is, it disregards that operationally 
equivalent resources may have different origins and histories, and may, therefore, 
carry different causality information. Selecting one instance of a resource rather 
than the other, may be as different as being or not being causally dependent on 
some previous event. And this may well be an information one is not ready to 
discard, which is the point of view of the individual token philosophy. 

In this paper, however, we focus on the collective token interpretation as 
the first step of a wider programme aimed at investigating the two approaches 
and their mutual relationships in terms of the behavioural, algebraic, and logical 
structures that can give adequate semantics account of each of them. 

Starting with the classical ‘token-game’ semantics, many behavioural models 
for Petri nets have been proposed that follow the collective token philosophy. In 
fact, too many to be systematically reviewed here. Among all these, however, a 
relatively recent proposal of van Glabbeek and Plotkin is that of configuration 
structures [0|. Clearly inspired by the domains of configurations of event struc- 
tures m, these are simply collections of (multi)sets that, at the same time, 
represent the legitimate system states and the system dynamics, i.e., the tran- 
sitions between such states. One of the themes of this paper is to compare con- 
figuration structure with the algebraic model based on monoidal categories HH, 
which also adopts the collective token philosophy and which provides a precise 
algebraic reinterpretation of yet another model, namely the commutative pro- 
cesses of Best and Devillers [p. In particular, we shall observe that configuration 
structures are too abstract a model, i.e., that they make undesirable identifica- 
tions of nets, and conclude that monoidal categories provide a superior model 
of net behaviour. 

To illustrate better the differences between the two semantic frameworks 
above, we adopt concurrent transition systems as a bridge-model. These are 
a much simplified, deterministic version of higher dimensional transition sys- 
tems P] that we select as the simplest one able to convey our ideas. Concurrent 
transition systems resemble configuration structures, but are more expressive. 
They also draw on earlier very significant models, such as distributed transition 
systems jOj, step and PN transition systems PH, and local event structures |S|. 
Moreover, the equivalence of the behavioural semantics of concurrent transi- 
tion systems and the algebraic semantics of monoidal categories can be stated 
very concisely. As we explain also in this paper, the algebraic semantics is itself 
amenable to a purely logical description in terms of theories in partial membership
equational logic m- 

The main result of this research is a new precise characterisation of the rela- 
tionships between all these behavioural, algebraic, and logical models within the 
collective token philosophy. We show that Best-Devillers commutative processes, 
the algebraic monoidal category model, and the concurrent transition system be- 
havioural model all coincide in the precise sense of being related by equivalences 
of categories. And we also show how the behavioural model afforded by configu- 
ration structures is too abstract, but is precisely related to all the above models 
by a natural transformation that characterises the identification of inequivalent 
nets and behaviours caused by configuration structures. 

The structure of the paper is as follows. In Section 1 we recall the basic 
definitions about PT Petri nets, remarking the distinction between the collec- 
tive and individual token philosophies, and we introduce the frameworks under 
comparison, i.e., configuration structures, concurrent transition systems, and 
monoidal categories (also in their membership equational logic characterisation), 
discussing for each of them the corresponding models that they associate to a 
Petri net. Section 2 and Section 3 compare concurrent transition systems with, 
respectively, monoidal categories and configuration structures. Finally, the con- 
cluding section describes related work on the individual token philosophy. 

1 Background 

1.1 Petri Nets and the Collective Token Philosophy 

Place/transition nets, the most widespread flavour of Petri nets, are graphs with
distributed states described by (finite) distributions of resources (‘tokens’) in
‘places’. These are usually called markings and represented as multisets $u : S \to \mathbb{N}$,
where $u(a)$ indicates the number of tokens that place $a$ carries in $u$. We shall
use $\mu(S)$ to indicate the set of finite multisets on $S$, i.e., multisets that yield a zero
on all but finitely many $a \in S$. Multiset union makes $\mu(S)$ a free commutative
monoid on $S$.

Definition 1. A place/transition (PT for short) Petri net $N$ is a tuple $(\partial_0, \partial_1, S, T)$,
where $S$ is a set of places, $T$ is a set of transitions, and $\partial_0, \partial_1 : T \to \mu(S)$ are
functions assigning, respectively, source and target to each transition.

Informally, $\partial_0(t)$ prescribes the minimum amount of resources needed to enable
$t$, whilst $\partial_1(t)$ describes the resources that the occurrence of $t$ contributes to
the global state. This is made explicit in the following definition, where we shall
indicate multiset inclusion, union, and difference by, respectively, $\subseteq$, $+$, and $-$.

Definition 2. Let $u$ and $v$ be markings and $X$ a finite multiset of transitions
of a net $N$. We say that $u$ evolves to $v$ under the step $X$, in symbols $u\,[X\rangle\,v$, if
the transitions in $X$ are concurrently enabled at $u$, i.e., $\sum_{t \in T} X(t) \cdot \partial_0(t) \subseteq u$,
and

$v = u + \sum_{t \in T} X(t) \cdot (\partial_1(t) - \partial_0(t))$.

A step sequence from $u_0$ to $u_n$ is a sequence $u_0\,[X_1\rangle\,u_1 \ldots u_{n-1}\,[X_n\rangle\,u_n$.

Fig. 1.

PT nets are often considered together with a state: a marked PT net $N$ is
a PT net $(\partial_0, \partial_1, S, T)$ together with an initial marking $u_0 \in \mu(S)$. In order to
equip PT nets with a natural notion of morphism, since $\mu(S)$ is a monoid
under $+$ with unit 0, we consider maps of transition systems that preserve the
additional structure.
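The step rule of Definition 2 can be sketched in a few lines of Python using `collections.Counter` for multisets. The function names and the example net are assumptions made for this sketch: the net has transitions `t` (from place `a` to place `b`) and `t'` (from `b` to `c`) with initial marking `a + b`, loosely in the spirit of the discussion of Figure 1 rather than a reproduction of it.

```python
from collections import Counter

def weighted_sum(step, f):
    """Sum of X(t) * f(t) over the transitions t of the step X."""
    total = Counter()
    for t, n in step.items():
        for place, k in f[t].items():
            total[place] += n * k
    return total

def included(u, v):
    """Multiset inclusion: u(a) <= v(a) for every place a."""
    return all(n <= v[a] for a, n in u.items())

def fire(pre, post, u, step):
    """Return the marking reached from u under the step, or None if not enabled."""
    need = weighted_sum(step, pre)
    if not included(need, u):
        return None                      # transitions not concurrently enabled
    v = Counter(u)
    v.subtract(need)                     # remove the consumed tokens
    v.update(weighted_sum(step, post))   # add the produced tokens
    return +v                            # drop places holding zero tokens

# Hypothetical example net: t : a -> b, t' : b -> c, initial marking a + b.
pre = {"t": Counter(a=1), "t'": Counter(b=1)}
post = {"t": Counter(b=1), "t'": Counter(c=1)}
u0 = Counter(a=1, b=1)

print(fire(pre, post, u0, Counter({"t": 1, "t'": 1})))  # t and t' fire concurrently
print(fire(pre, post, u0, Counter({"t'": 2})))          # prints None: only one token in b
```

Because markings do not record where a token came from, the sketch follows the collective token view: firing `t` first and then `t'` twice is allowed, and nothing distinguishes which token in `b` each occurrence of `t'` consumed.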

Definition 3. A morphism of nets from $N = (\partial_0, \partial_1, S, T)$ to $N' = (\partial'_0, \partial'_1, S', T')$
is a pair $(f_t, f_p)$ where $f_t : T \to T'$ is a function and $f_p : \mu(S) \to \mu(S')$ is a homomorphism
of monoids such that $\partial'_i \circ f_t = f_p \circ \partial_i$, for $i = 0, 1$. A morphism of marked nets
is a morphism of nets such that $f_p(u_0) = u'_0$.
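As a small illustration of Definition 3, the commutation condition can be checked mechanically for finite nets. The sketch below relies on the fact that a monoid homomorphism out of a free commutative monoid is determined by its values on places, so $f_p$ is given as a dictionary from places to multisets; the function names and the two example nets are hypothetical.

```python
from collections import Counter

def extend(fp_on_places, u):
    """Extend a map on places to the induced monoid homomorphism on multisets."""
    out = Counter()
    for a, n in u.items():
        for b, k in fp_on_places[a].items():
            out[b] += n * k
    return +out

def is_net_morphism(net1, net2, ft, fp_on_places):
    """Check that sources and targets commute with (ft, fp) for every transition."""
    (pre1, post1), (pre2, post2) = net1, net2
    return all(
        extend(fp_on_places, pre1[t]) == +pre2[ft[t]]
        and extend(fp_on_places, post1[t]) == +post2[ft[t]]
        for t in pre1
    )

# Hypothetical nets: N has t : a -> b, and N' has a self-loop s : p -> p.
n1 = ({"t": Counter(a=1)}, {"t": Counter(b=1)})
n2 = ({"s": Counter(p=1)}, {"s": Counter(p=1)})

# Collapsing both places of N onto p is a morphism.
print(is_net_morphism(n1, n2, {"t": "s"}, {"a": Counter(p=1), "b": Counter(p=1)}))
# Sending b to the empty multiset breaks the target condition.
print(is_net_morphism(n1, n2, {"t": "s"}, {"a": Counter(p=1), "b": Counter()}))
```

For marked nets one would additionally compare `extend(fp_on_places, u0)` with the initial marking of the target net.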

We shall use Petri (respectively Petri*) to indicate the category of (marked) 
PT nets and their morphisms with the obvious componentwise composition of 
arrows. 

To compare the effects of the collective and of the individual token philosophy
on observing causal relations between fired transitions, let us consider the
example in Figure 1 that we adapt from 0. (As usual, boxes stand for transitions,
circles for places, dots for tokens, and oriented arcs represent $\partial_0$ and
$\partial_1$.)

Observe that the firing of t produces a second token in place b. According 
to the individual token philosophy, it makes a difference whether t' consumes 
the token b originated from the firing of t, or the one coming from the initial 
marking. In the first case the occurrence of t' causally depends on that of t, and 
in the second the two firings are independent. In the collective token philosophy, 
instead, the two firings are always considered to be concurrent, because the firing 
of t does not change the enabling condition of t'. 



1.2 Configuration Structures 

In the same paper where they introduce the distinction between collective token 
and individual token philosophy, van Glabbeek and Plotkin propose configura- 
tion structures to represent the behaviour of nets according to the collective 
token philosophy. These are structures inspired by event structures m whose 
dynamics is uniquely determined by an explicitly-given set of possible configurations
of the system. However, the structures they end up associating to nets are
not exactly configuration structures. They enrich them in two ways: firstly, by 
considering multisets instead of sets of occurrences, and secondly, by using an 
explicit transition relation between configurations. While the first point can be 
handled easily, as we do below, the second one seems to compromise the basic 
ideas underlying the framework and to show that configuration structures do 
not offer a faithful representation of the behaviour of nets under the collective 
token philosophy. 

Definition 4. A configuration structure is given by a set E and a collection C 
of finite multisets over the set E. The elements of E are called events, and the 
elements of C configurations. 

The idea is that an event is an occurrence of an action the system may 
perform, and that a configuration X represents a state of the system, which 
is determined by the collection X of occurred events. The set C of admissible 
configurations yields a relation representing how the system can evolve from one 
state to another. 

Definition 5. Let $(E, C)$ be a configuration structure. For $X$, $Y$ in $C$ we write
$X \to Y$ if

(1) $X \subseteq Y$,

(2) $Y - X$ is finite,

(3) for any multiset $Z$ such that $X \subseteq Z \subseteq Y$, we have $Z \in C$.

The relation $\to$ is called the step transition relation.

Intuitively, $X \to Y$ means that the system can evolve from state $X$ to
state $Y$ by performing the events in $Y - X$ concurrently. To stress this we shall
occasionally write $X \xrightarrow{L} Y$, with $L = Y - X$. Observe that the last condition
states that the events in $Y - X$ can be performed concurrently if and only if
they can be performed in any order. In our opinion, this requirement embodies
an interleaving-oriented view, as it reduces concurrency to nondeterminism. As
we explain below, we view this as the main weakness of configuration structures.
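The step transition relation of Definition 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: multisets are modelled as `collections.Counter` objects, and the helper names `freeze`, `included`, `between` and `step` are ours, not part of the original development. Condition (2) holds automatically because a `Counter` is finite.

```python
from collections import Counter
from itertools import product

def freeze(ms):
    """Canonical hashable form of a finite multiset (zero counts dropped)."""
    return frozenset((e, n) for e, n in ms.items() if n > 0)

def included(x, y):
    """Multiset inclusion: every event occurs in y at least as often as in x."""
    return all(y.get(e, 0) >= n for e, n in x.items() if n > 0)

def between(x, y):
    """All multisets Z with x <= Z <= y (assumes x <= y)."""
    events = sorted(set(x) | set(y))
    ranges = [range(x.get(e, 0), y.get(e, 0) + 1) for e in events]
    for counts in product(*ranges):
        yield Counter({e: n for e, n in zip(events, counts) if n > 0})

def step(configs, x, y):
    """x -> y in the sense of Definition 5; configs is a set of frozen multisets."""
    if freeze(x) not in configs or freeze(y) not in configs:
        return False
    if not included(x, y):                                      # condition (1)
        return False
    return all(freeze(z) in configs for z in between(x, y))     # condition (3)

# A four-element configuration structure over the events t0 and t1.
C = {freeze(Counter(m)) for m in [{}, {"t0": 1}, {"t1": 1}, {"t0": 1, "t1": 1}]}
```

With all four configurations present, `step(C, Counter(), Counter(t0=1, t1=1))` holds. Removing the configuration `{t1}` makes it fail, which is exactly the "concurrent if and only if executable in any order" reading criticised above.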

In the following definition we slightly refine the notion of net configuration
proposed by van Glabbeek and Plotkin, as this may improperly include multisets
of transitions that cannot be fired from the initial marking.

Definition 6 (From PT Nets to Config. Structures). Let $N = (\partial_0, \partial_1, S, T, u_0)$
be a marked PT net. A finite multiset $X$ of transitions is called fireable
if there exists a partition $X_1, \ldots, X_n$ of $X$ such that
$u_0 [X_1\rangle u_1 \ldots u_{n-1} [X_n\rangle u_n$
is a step sequence. A configuration of $N$ is a fireable multiset $X$ of transitions.
The configuration structure associated to $N$ is $cs(N) = (T, C_N)$, where $C_N$ is
the set of configurations of $N$.

Fig. 2. The nets N and M of our running example.

Fig. 3. The configuration structure cs(N) = cs(M) for the nets N and M.



It follows that for each configuration $X$ the function $u_X : S \to \mathbb{Z}$ given by

$$u_X = u_0 + \sum_{t \in T} X(t) \cdot (\partial_1(t) - \partial_0(t))$$

is a (reachable) marking, i.e., $0 \le u_X(a)$ for all $a \in S$. Moreover, if $X$ is a
configuration and $u_X [U\rangle v$, then $X + U$ is also a configuration and $v = u_{X+U}$.
Generally speaking, if $N$ is a pure net, i.e., a net with no self-loops, $cs(N)$
can be considered a reasonable semantics for $N$. Otherwise, as has also been
observed in the literature, it is not a good idea to reduce $N$ to $cs(N)$. Consider
for example the marked nets $N$ and $M$ of Figure 2. They have very different
behaviours indeed: in $N$ the actions $t_0$ and $t_1$ are concurrent, whereas in $M$
they are mutually exclusive. However, since in $M$ any interleaving of $t_0$ and $t_1$
is possible, the diagonal $\emptyset \to \{t_0, t_1\}$ sneaks into the structure by definition.
As a result, both $N$ and $M$ yield the configuration structure represented in
Figure 3, even though $\{t_0, t_1\}$ is not an admissible step for $M$. The limit case is
the marked net consisting of a single self-loop: the readers can check for themselves
that, according to cs(_), it can fire arbitrarily large steps.
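The following Python sketch illustrates the point on two small nets. Since Figure 2 is not reproduced here, the concrete nets `N` and `M` below are hypothetical stand-ins chosen to match the description (in `M` both transitions test a shared place `s` through a self-loop). The dictionary encoding and the function names are our own assumptions as well.

```python
from collections import Counter
from itertools import product

# Hypothetical nets: disjoint places in N; a shared self-loop place s in M.
N = {"u0": {"p0": 1, "p1": 1},
     "pre":  {"t0": {"p0": 1}, "t1": {"p1": 1}},
     "post": {"t0": {"q0": 1}, "t1": {"q1": 1}}}
M = {"u0": {"p0": 1, "p1": 1, "s": 1},
     "pre":  {"t0": {"p0": 1, "s": 1}, "t1": {"p1": 1, "s": 1}},
     "post": {"t0": {"q0": 1, "s": 1}, "t1": {"q1": 1, "s": 1}}}

def marking(net, x):
    """u_X = u_0 + sum over t of X(t) * (post(t) - pre(t))."""
    u = Counter(net["u0"])
    for t, n in x.items():
        for a, k in net["post"][t].items():
            u[a] += n * k
        for a, k in net["pre"][t].items():
            u[a] -= n * k
    return u

def step_enabled(net, u, step):
    """Can the multiset `step` of transitions fire concurrently at marking u?"""
    need = Counter()
    for t, n in step.items():
        for a, k in net["pre"][t].items():
            need[a] += n * k
    return all(u[a] >= k for a, k in need.items())

def fireable(net, x, done=None):
    """Is x fireable from u_0? Firing one transition at a time suffices,
    because every step can be sequentialised."""
    done = done or Counter()
    if all(done[t] >= n for t, n in x.items()):
        return True
    u = marking(net, done)
    return any(step_enabled(net, u, {t: 1})
               and fireable(net, x, done + Counter({t: 1}))
               for t in x if done[t] < x[t])

def configurations(net, bound=2):
    """Fireable multisets with multiplicities up to `bound`, as count tuples."""
    ts = sorted(net["pre"])
    return {tuple(c) for c in product(range(bound + 1), repeat=len(ts))
            if fireable(net, Counter(dict(zip(ts, c))))}
```

Under these assumptions `configurations(N)` and `configurations(M)` coincide, although `step_enabled` accepts the step `{t0, t1}` at the initial marking of `N` and rejects it for `M`.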

These problems have prompted us to look for a semantic framework that
represents net behaviours more faithfully than configuration structures. The key
observation is that there is nothing wrong with the assumption that if a step
involving many parallel actions can occur in a certain state, then all the possible
interleaving sequences of those actions can also occur from that state. The
problematic bit is assuming the inverse implication, because, as a matter of fact,
it reduces concurrency to nondeterminism and makes the set of configurations
determine uniquely the transition relation. Our proposed solution is concurrent
transition systems.

1.3 Concurrent Transition Systems 

The analysis of the previous section suggests seeking a model that enforces the
existence of all appropriate interleavings of steps, without allowing this to
determine the set of transitions completely. Several such models appear in the
literature. Among those that inspired us most, we recall distributed transition
systems, step transition systems, PN transition systems, and higher
dimensional transition systems. Also closely related are local event
structures, a model that extends event structures (rather than transition
systems) by allowing the firing of sets (but not multisets) of events. Drawing on all
these, we have here chosen the simplest definition that suits our current aim.

Definition 7. A concurrent transition system (CTS for short) is a structure
$H = (S, L, trans, s_0)$, where $S$ is a set of states, $L$ is a set of actions, $s_0 \in S$ is
the initial state, and $trans \subseteq S \times (\mu(L) - \{\emptyset\}) \times S$ is a set of transitions, such
that:

(1) if $(s, U, s_1), (s, U, s_2) \in trans$, then $s_1 = s_2$,

(2) if $(s, U, s') \in trans$ and $U_1, U_2$ is a partition of $U$, then there exist $v_1, v_2 \in S$
such that $(s, U_1, v_1), (s, U_2, v_2), (v_1, U_2, s'), (v_2, U_1, s') \in trans$.

Condition (1) above states that the execution of a multiset of labels $U$ in a
state $s$ deterministically leads to a different state. The second condition
guarantees that all the possible interleavings of the actions in $U$ are possible paths
from $s$ to $s'$ if $(s, U, s') \in trans$. Notice that, by (1), the states $v_1$ and $v_2$ of (2)
are uniquely determined.
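As an informal aid, the two conditions of Definition 7 can be checked mechanically on a finite set of transitions. The sketch below is an assumption-laden illustration rather than part of the formal development: transitions are triples `(state, Counter, state)`, states are any hashable values other than `None`, and `key`, `splits` and `is_cts` are hypothetical helper names.

```python
from collections import Counter
from itertools import product

def key(ms):
    """Hashable form of a multiset of actions."""
    return frozenset((a, n) for a, n in ms.items() if n > 0)

def splits(u):
    """All ways to write the multiset u as u1 + u2 with both parts non-empty."""
    acts = sorted(u)
    for counts in product(*[range(u[a] + 1) for a in acts]):
        u1 = Counter({a: n for a, n in zip(acts, counts) if n > 0})
        u2 = Counter(u)
        u2.subtract(u1)
        u2 = +u2                      # drop zero counts
        if u1 and u2:
            yield u1, u2

def is_cts(trans):
    """Check conditions (1) and (2) of Definition 7 on a finite transition list."""
    table = {}
    for s, u, s2 in trans:
        if table.setdefault((s, key(u)), s2) != s2:
            return False              # condition (1): determinism fails
    for s, u, s2 in trans:
        for u1, u2 in splits(u):
            v1 = table.get((s, key(u1)))
            if v1 is None or table.get((v1, key(u2))) != s2:
                return False          # condition (2): an interleaving is missing
    return True

t0, t1 = Counter({"t0": 1}), Counter({"t1": 1})
SQUARE = [("x", t0, "a"), ("x", t1, "b"), ("a", t1, "y"), ("b", t0, "y")]
FILLED = SQUARE + [("x", t0 + t1, "y")]
```

Both `SQUARE` and `FILLED` pass the check: a CTS may contain the interleavings without the diagonal, but never the diagonal without the interleavings. Because `splits` yields each partition in both orders, checking one path per split covers all four transitions required by condition (2).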

We formalise the idea that different paths which are different interleavings 
of the same concurrent step can be considered equivalent. 

Definition 8. A path in a CTS is a sequence of contiguous transitions

$(s, U_1, s_1)(s_1, U_2, s_2) \cdots (s_{n-1}, U_n, s_n)$.

A run is a path that originates from the initial state.

Definition 9. Given a CTS $H$, adjacency is the least reflexive, symmetric,
binary relation $\leftrightarrow_H$ on the paths of $H$ which is closed under path concatenation
and such that

$(s, U_1, s_1)(s_1, U_2, s_2) \leftrightarrow_H (s, U_1 + U_2, s_2)$.

Then, the homotopy relation $\Leftrightarrow_H$ on the paths of $H$ is the transitive closure of
$\leftrightarrow_H$. The equivalence classes of runs of $H$ with respect to the homotopy relation
are called computations.

In order to simplify our exposition, we now refine the notion of concurrent
transition system so as to be able to associate to each path between two states 
the same multiset of actions. As we shall see, such transition systems enjoy 
interesting properties. 

Definition 10. A CTS is uniform if all its states are reachable from the initial
state, and the unions of the actions along any two cofinal runs yield the same
multiset, where cofinal means ending in the same state.

In a uniform CTS $H = (S, L, trans, s_0)$ each state $s$ can be associated with
the multiset of actions on any run to $s$. Precisely, we shall use $\varsigma_s$ to indicate
$\sum_{i=1}^{n} U_i$ for $(s_0, U_1, s_1)(s_1, U_2, s_2) \cdots (s_{n-1}, U_n, s)$ a run of $H$. Observe also that
uniform CTS are necessarily acyclic, because any cycle $(s, U_0, s_1) \ldots (s_n, U_n, s)$
would imply the existence of runs to $s$ carrying different actions. In the rest of
the paper, we shall consider only uniform concurrent transition systems.

Introducing the natural notion of computation-preserving morphism for CTS,
we define a category of uniform concurrent transition systems. In the following,
for functions $f : A \to B$, we denote by $f^\mu : \mu(A) \to \mu(B)$ the obvious multiset
extension of $f$, i.e., $f^\mu(X)(b) = \sum_{a \in f^{-1}(b)} X(a)$.
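The multiset extension just described amounts to relabelling elements and adding up multiplicities. A minimal Python sketch, assuming finite multisets are represented as `collections.Counter` objects (the name `multiset_extension` is ours):

```python
from collections import Counter

def multiset_extension(f):
    """Lift f: A -> B to finite multisets: f_mu(X)(b) is the sum of X(a)
    over all a with f(a) = b."""
    def f_mu(x):
        out = Counter()
        for a, n in x.items():
            out[f(a)] += n
        return out
    return f_mu

# Example: a relabelling that forgets the index of each transition name.
forget_index = multiset_extension(lambda name: name[0])
```

For instance, applying `forget_index` to the multiset with two copies of `t0` and one of `t1` gives three copies of `t`.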

Definition 11. For $H_1$ and $H_2$ CTS, a morphism from $H_1$ to $H_2$ consists of a
map $f : S_1 \to S_2$ that preserves the initial state and a function $\alpha : L_1 \to L_2$
such that $(s, U, s') \in trans_1$ implies $(f(s), \alpha^\mu(U), f(s')) \in trans_2$.

We denote by CTS the category of uniform CTS and their morphisms. 

Definition 12 (From PT Nets to CTS). Let $N = (\partial_0, \partial_1, S, T, u_0)$ be a
marked PT Petri net. The concurrent transition system associated to $N$ is

$ct(N) = (M_N, T, trans_N, \emptyset)$,

where $M_N$ is the set of fireable multisets of transitions of $N$, and $(X, U, X') \in
trans_N$ if and only if $u_X [U\rangle u_{X'}$. (Recall that $u_X : S \to \mathbb{Z}$ is by definition a
reachable marking.)

Although this construction is formally very close to that proposed for
configuration structures, the difference is that CTS do not enforce diagonals to fill
the squares: these are introduced if and only if the associated step is actually
possible (see Figure 4). We shall give a precise categorical characterisation of the
representations of nets in the CTS framework in Section 2. For the time being,
we notice the following.

Proposition 1. ct(_) is a functor from Petri* to CTS.

Although all cofinal runs of a CTS carry the same multiset of actions, it is not 
the case that all such runs are homotopic, i.e., they do not necessarily represent 
the same computation. Enforcing this is the purpose of the next definition. 

Fig. 4. The CTS ct(N) and ct(M) for the nets N and M of Figure 2.



Definition 13. An occurrence concurrent transition system is a concurrent
transition system $H$ in which all pairs of cofinal transitions $(s_1, U_1, s), (s_2, U_2, s)
\in trans_H$ are the final steps of homotopic paths.

It can be shown that the previous definition implies the following property. 

Proposition 2. All cofinal paths of an occurrence CTS are homotopic. 

We shall use oCTS to indicate the full subcategory of CTS consisting of 
occurrence CTS. Clearly, a uniform CTS can be unfolded into an occurrence 
CTS. 

Definition 14 (From CTS to Occurrence CTS). Let $H = (S, L, trans, s_0)$
be a concurrent transition system. Its unfolding is the occurrence concurrent
transition system $\mathcal{O}(H) = (S', L, trans', \epsilon)$, where $S'$ is the collection of
computations of $H$, and

$trans' = \{([\pi]_{\Leftrightarrow}, U, [\pi']_{\Leftrightarrow}) \mid \exists s, s' \in S,\ [\pi]_{\Leftrightarrow} \in S',\ \pi' \Leftrightarrow_H \pi (s, U, s')\}$.

Proposition 3. $\mathcal{O}(\_)$ extends to a right adjoint to the inclusion of oCTS in
CTS.

Proof. For $H$ a concurrent transition system, consider $\epsilon_H : \mathcal{O}(H) \to H$ that maps
each $[\pi]_{\Leftrightarrow} \in S_{\mathcal{O}(H)}$ to its final state $s \in S_H$. It is easy to verify that this forms
the counit of the adjunction.

1.4 Monoidal Categories 

Several interesting aspects of Petri net theory can be profitably developed within
category theory. Here we focus on the approach which exposes the monoidal
structure of Petri nets under the operation of parallel composition. In that
approach it is shown that the sets of transitions can be endowed with appropriate
algebraic structures in order to capture some basic constructions on nets. In particular,
the commutative processes of Best and Devillers, which represent the natural
behavioural model for PT nets under the collective token philosophy, can be
characterised adding a functorial sequential composition on the monoid of steps,
thus yielding a strictly symmetric strict monoidal category $\mathcal{T}(N)$.

Definition 15. For $N$ a PT net, let $\mathcal{T}(N)$ be the strictly symmetric strict
monoidal category freely generated by $N$.



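Using CMonCat to denote the category of strictly symmetric strict monoidal
categories and strict monoidal functors, $\mathcal{T}(\_)$ is a functor from Petri to
CMonCat. The category $\mathcal{T}(N)$ can be inductively defined by the following
inference rules and axioms.

$$\frac{u \in \mu(S_N)}{id_u : u \to u \in \mathcal{T}(N)}
\qquad
\frac{t \in T_N,\ \partial_0(t) = u,\ \partial_1(t) = v}{t : u \to v \in \mathcal{T}(N)}$$

$$\frac{\alpha : u \to v,\ \beta : u' \to v' \in \mathcal{T}(N)}{\alpha \oplus \beta : u + u' \to v + v' \in \mathcal{T}(N)}
\qquad
\frac{\alpha : u \to v,\ \beta : v \to w \in \mathcal{T}(N)}{\alpha ; \beta : u \to w \in \mathcal{T}(N)}$$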



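where the following equations, stating that $\mathcal{T}(N)$ is a strictly symmetric strict
monoidal category, are satisfied by all arrows $\alpha, \alpha', \beta, \beta', \gamma, \delta$ and all multisets
$u$ and $v$.

neutral:        $id_\emptyset \oplus \alpha = \alpha$,
commutativity:  $\alpha \oplus \beta = \beta \oplus \alpha$,
associativity:  $(\alpha \oplus \beta) \oplus \delta = \alpha \oplus (\beta \oplus \delta)$,  $(\alpha ; \beta) ; \gamma = \alpha ; (\beta ; \gamma)$,
identities:     $\alpha ; id_v = \alpha = id_u ; \alpha$ (for $\alpha : u \to v$),  $id_u \oplus id_v = id_{u+v}$,
functoriality:  $(\alpha ; \beta) \oplus (\alpha' ; \beta') = (\alpha \oplus \alpha') ; (\beta \oplus \beta')$.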



The intuition here is that arrows are step sequences and arrow composition is
their concatenation, whereas the monoidal operator $\oplus$ allows for parallel
composition. It turns out that this algebraic structure describes precisely the processes
à la Best and Devillers.
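As a small worked instance of these axioms (a sketch using two hypothetical transitions $t_0 : u_0 \to v_0$ and $t_1 : u_1 \to v_1$, which are not taken from the source), the identity and functoriality laws show that a parallel step equals each of its two interleavings:

```latex
\begin{align*}
t_0 \oplus t_1
  &= (t_0 ; id_{v_0}) \oplus (id_{u_1} ; t_1)
   = (t_0 \oplus id_{u_1}) ; (id_{v_0} \oplus t_1), \\
t_0 \oplus t_1
  &= (id_{u_0} ; t_0) \oplus (t_1 ; id_{v_1})
   = (id_{u_0} \oplus t_1) ; (t_0 \oplus id_{v_1}).
\end{align*}
```

In each line the first equality uses the identity laws and the second uses functoriality. This is the algebraic counterpart of condition (2) in Definition 7: whenever a step is present, so are its interleavings, and they denote the same arrow.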



Proposition 4. The presentation of $\mathcal{T}(N)$ given above provides a
complete and sound axiomatisation of the algebra of the commutative processes
of $N$.

By analogy with Petri*, we take a pointed category $(C, c_0)$ to be a category
$C$ together with a distinguished object $c_0 \in C$. Similarly, a pointed functor
from $(C, c_0)$ to $(D, d_0)$ is a functor $F : C \to D$ that maps the distinguished
object $c_0$ to the distinguished object $d_0$. Then, using CMonCat* to denote
the category of pointed strictly symmetric strict monoidal categories and their
pointed functors, the previous construction extends immediately to a functor
$\mathcal{T}_*(\_)$: Petri* $\to$ CMonCat*, such that for $N = (\partial_0, \partial_1, S, T, u_0)$ a marked PT
net, then

$\mathcal{T}_*(N) = (\mathcal{T}(\partial_0, \partial_1, S, T), u_0)$.

1.5 A Logical Characterisation of the Algebraic Model

The algebraic semantics of PT Petri nets can be expressed very compactly by
means of a morphism between theories in partial membership equational logic
(PMEqtl), a logic of partial algebras with subsorts and subsort polymorphism
whose sentences are Horn clauses on equations $t = t'$ and membership
assertions $t : s$. Such a characterisation can also have practical applications, as
there are tools available that support executable specifications in partial
algebras. This section and the Appendix provide an informal introduction to the
main ideas of PMEqtl. The interested reader is referred to the literature for
self-contained presentations.

A theory in PMEqtl is a pair $T = (\Omega, \Gamma)$, where $\Omega$ is a signature over a
poset of sorts and $\Gamma$ is a set of PMEqtl-sentences in the language of $\Omega$. We
denote by $\mathbf{PAlg}_\Omega$ the category of partial $\Omega$-algebras, and by $\mathbf{PAlg}_T$ its full
subcategory consisting of $T$-algebras, i.e., those partial $\Omega$-algebras that satisfy
all the sentences in $\Gamma$.

The features of PMEqtl (partiality, poset of sorts, membership assertions)
offer a natural framework for the specification of categorical structures. For
instance, a notion of tensor product for partial algebraic theories is used in [12] to
obtain, among other things, a very elegant definition of the theory of monoidal
categories that we recall in the Appendix. More precisely, we define the theories
PETRI of PT nets and CMONCAT of strictly symmetric strict monoidal categories,
using a self-explanatory Maude-like notation (Maude is a language recently
developed at SRI International; it is based on rewriting logic and supports the
execution of membership equational logic specifications).

To study the relationships between PETRI and CMONCAT, the Appendix also
defines an intermediate theory CMON-AUT of automata whose states form a
commutative monoid. Our main result is then that the composition of the
obvious inclusion functor of Petri into $\mathbf{PAlg}_{\text{CMON-AUT}}$ and the free functor $\mathcal{F}_V$ from
$\mathbf{PAlg}_{\text{CMON-AUT}}$ to $\mathbf{PAlg}_{\text{CMONCAT}}$ associated to the theory morphism $V$ from CMON-AUT
to CMONCAT corresponds exactly to the functor $\mathcal{T}(\_)$: Petri $\to$ CMonCat.

Proposition 5. The functor $\mathcal{T}(\_)$: Petri $\to$ CMonCat is the composition
$\mathbf{Petri} \hookrightarrow \mathbf{PAlg}_{\text{CMON-AUT}} \xrightarrow{\mathcal{F}_V} \mathbf{PAlg}_{\text{CMONCAT}}$.

2 Concurrent Transition Systems and Monoidal 
Categories 

In this section we state the faithfulness of the CTS representation of nets, as given
in Definition 12, with respect to the collective token philosophy. To accomplish
this aim, we show that the ct(_) and the $\mathcal{T}(\_)$ constructions yield
equivalent categories of net behaviours.

Regarding the monoidal approach, the obvious choice consists in taking the
comma category of $\mathcal{T}(N)$ with respect to the initial marking, thus yielding a
category whose objects are the commutative processes of $N$ from its initial marking.
An arrow from process $p$ to process $q$ is then the unique commutative process $r$
such that $p ; r = q$ in $\mathcal{T}(N)$. We denote the resulting category by $(u_0 \downarrow \mathcal{T}(N))$.

An analogous construction can be defined starting from $ct(N)$. The first step
is to observe that the paths of a generic CTS under the homotopy relation define
a category.

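Definition 16. For $H = (S, L, trans, s_0)$ a CTS, we define the category of
computations of $H$ to be the category $C(H)$ whose

- objects are computations of $H$,

- arrows are the homotopy equivalence classes of paths in $H$ such that
  $[\psi]_{\Leftrightarrow} : [\pi]_{\Leftrightarrow} \to [\pi']_{\Leftrightarrow}$ iff $\pi\psi \Leftrightarrow_H \pi'$,

- arrow composition is defined as the homotopy class of path concatenation,
  i.e., $[\psi]_{\Leftrightarrow} ; [\psi']_{\Leftrightarrow} = [\psi\psi']_{\Leftrightarrow}$,

- identity arrow at $[\pi]_{\Leftrightarrow}$ is the homotopy class of the empty path at
  the final state of $\pi$.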

This construction extends easily to a functor C(_) from CTS to Cat, the
category of (small) categories and functors, yielding a functor C(ct(_)) from
Petri* to Cat. Observe also that C(_) factors through $\mathcal{O}$: CTS $\to$ oCTS via the
obvious path construction.

Theorem 1. Let $N$ be a marked PT net with initial marking $u_0$. Then, the
categories $C(ct(N))$ and $(u_0 \downarrow \mathcal{T}(N))$ are isomorphic.

Proof. We sketch the definition of functors

$F : (u_0 \downarrow \mathcal{T}(N)) \to C(ct(N))$ and $G : C(ct(N)) \to (u_0 \downarrow \mathcal{T}(N))$

inverse to each other. The functor $F$ maps an object of the comma category
to the homotopy class of any of the object's interleavings (which is well-defined
because of the diamond equivalence). Its action on morphisms is analogous.

On the other hand, for a computation $[\pi]_{\Leftrightarrow}$ in $C(ct(N))$, starting from the
initial marking we can determine uniquely the corresponding arrow in $\mathcal{T}(N)$,
and therefore define the action of $G$ on both objects and arrows.

The categories of computations for the concurrent transition systems
associated to the nets $N$ and $M$ of Figure 2 are shown in Figure 5, where we use $c_0$ and $c_1$ to
denote, respectively, the computations $[(\emptyset, \{t_0\}, \{t_0\})]_{\Leftrightarrow}$ and $[(\emptyset, \{t_1\}, \{t_1\})]_{\Leftrightarrow}$
in both $ct(N)$ and $ct(M)$. Analogously, $p_1$ and $p_0$ indicate the homotopy
classes of the paths $[(\{t_0\}, \{t_1\}, \{t_0, t_1\})]_{\Leftrightarrow}$ and $[(\{t_1\}, \{t_0\}, \{t_0, t_1\})]_{\Leftrightarrow}$,
respectively. However, $c_0 ; p_1$ and $c_1 ; p_0$ yield the same result
$c = [(\emptyset, \{t_0, t_1\}, \{t_0, t_1\})]_{\Leftrightarrow}$ in $C(ct(N))$, whereas in $C(ct(M))$ they denote
different objects: $c' = [(\emptyset, \{t_0\}, \{t_0\})(\{t_0\}, \{t_1\}, \{t_0, t_1\})]_{\Leftrightarrow}$ and
$c'' = [(\emptyset, \{t_1\}, \{t_1\})(\{t_1\}, \{t_0\}, \{t_0, t_1\})]_{\Leftrightarrow}$.

Fig. 5. The categories C(ct(N)) and C(ct(M)) for the nets of Figure 2.
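The difference between the two categories of computations can be reproduced with a short Python sketch. It enumerates the runs of a finite acyclic CTS and merges adjacent runs with a union-find structure. The encoding is our own assumption: states and action multisets are sorted tuples, and `ct_N` and `ct_M` are hand-written versions of the two transition systems of the running example.

```python
def runs(trans, s0):
    """All runs (paths from s0) of a finite acyclic transition system."""
    out, frontier = [()], [((), s0)]
    while frontier:
        path, s = frontier.pop()
        for (a, u, b) in trans:
            if a == s:
                p = path + ((a, u, b),)
                out.append(p)
                frontier.append((p, b))
    return out

def computations(trans, s0):
    """Homotopy classes of runs: union-find closure of the adjacency relation."""
    tset = set(trans)
    all_runs = runs(trans, s0)
    parent = {p: p for p in all_runs}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for p in all_runs:
        for i in range(len(p) - 1):
            (s, u1, _), (_, u2, s2) = p[i], p[i + 1]
            merged = (s, tuple(sorted(u1 + u2)), s2)
            if merged in tset:                 # two steps fused into one
                q = p[:i] + (merged,) + p[i + 2:]
                parent[find(p)] = find(q)
    classes = {}
    for p in all_runs:
        classes.setdefault(find(p), set()).add(p)
    return list(classes.values())

def ending_at(classes, s):
    """Number of computations whose runs end in state s."""
    return sum(1 for c in classes if any(p and p[-1][2] == s for p in c))

E, T0, T1, T01 = (), ("t0",), ("t1",), ("t0", "t1")
ct_M = [(E, T0, T0), (E, T1, T1), (T0, T1, T01), (T1, T0, T01)]
ct_N = ct_M + [(E, T01, T01)]
```

With this encoding there is a single computation ending in the state `{t0, t1}` for `ct_N` (the class written $c$ above) and two for `ct_M` (the classes $c'$ and $c''$).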



3 Configuration Structures and Concurrent Transition 
Systems 

In this section we first give a categorical structure to the class of configuration 
structures, and then show that the obvious injection of configuration structures 
into CTS yields a reflection. 

Definition 17. For $(E_1, C_1)$ and $(E_2, C_2)$ configuration structures, a
cs-morphism from $(E_1, C_1)$ to $(E_2, C_2)$ is a function $g : E_1 \to E_2$ such that for each
configuration $X \in C_1$, then $g^\mu(X) \in C_2$. We denote by CSCat the category of
configuration structures and cs-morphisms.

The obvious injection functor $\mathcal{J}(\_)$ from CSCat to CTS maps a configuration
structure $CS = (E, C)$ into the concurrent transition system

$\mathcal{J}(CS) = (C, E, trans_{CS}, s_0)$,

where $trans_{CS} = \{(X, L, Y) \mid X \xrightarrow{L} Y\}$, and maps a cs-morphism $g : E_1 \to E_2$
to the morphism $(g', g)$, where $g' : C_1 \to C_2$ is the obvious extension $g^\mu$ of $g$ to
multisets, with domain restricted to $C_1$.

Theorem 2. The functor $\mathcal{J}(\_)$: CSCat $\to$ CTS is the right adjoint of a
functor $\mathcal{R}(\_)$: CTS $\to$ CSCat. Moreover, since the counit of the adjunction is the
identity, $\mathcal{J}(\_)$ and $\mathcal{R}(\_)$ define a full reflection.

Proof. We sketch the proof, giving the precise definition of the reflection functor.
The reflection functor $\mathcal{R}(\_)$ maps a uniform CTS $H = (S, L, trans, s_0)$ into the
configuration structure $\mathcal{R}(H) = (L, C_S)$ such that $C_S = \{\varsigma_s \mid s \in S\}$ (recall that
$\varsigma_s$ is the multiset union of the actions of any run leading to $s$).

We denote the component at $H$ of the unit of the adjunction by $\rho_H : H \to \mathcal{J}(\mathcal{R}(H))$.

Theorem 3 (Configuration Structures via CTS). Let $N$ be a marked PT
net. Then $cs(N) = \mathcal{R}(ct(N))$.

Proof. The events of $cs(N)$, the actions of $ct(N)$ and, therefore, the events of
$\mathcal{R}(ct(N))$ are the transitions of $N$. The states $S$ of the uniform CTS $ct(N)$
are exactly the configurations of $cs(N)$, and for each $s \in S$, we have $\varsigma_s = s$.
This suffices, since a configuration structure is entirely determined by its set of
configurations.
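The collapse performed by the reflection can be observed on the running example with a few lines of Python. As before, this is a sketch under our own encoding (sorted tuples for multisets, hand-written transition lists, arbitrary state names), and `reflect` is a hypothetical name for the object part of the reflection functor on a finite uniform CTS.

```python
def reflect(trans, s0):
    """Configurations of the reflected structure: for each reachable state s,
    the multiset of actions along a run to s. Assumes the CTS is uniform,
    so the choice of run does not matter."""
    label = {s0: ()}
    changed = True
    while changed:
        changed = False
        for s, u, s2 in trans:
            if s in label and s2 not in label:
                label[s2] = tuple(sorted(label[s] + u))
                changed = True
    return set(label.values())

SQUARE = [("x", ("t0",), "a"), ("x", ("t1",), "b"),
          ("a", ("t1",), "y"), ("b", ("t0",), "y")]
ct_M = SQUARE                                   # interleavings only
ct_N = SQUARE + [("x", ("t0", "t1"), "y")]      # interleavings plus the step
```

Both transition systems are sent to the same set of four configurations, so the information about whether the diagonal step exists is lost.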

These results support our claim that configuration structures do not offer
a faithful representation of net behaviours. In fact, $\mathcal{R}(\_)$ clearly collapses the
structure excessively, as the natural transformation associated to the reflection
map $\rho$ can identify non-homotopic runs (e.g., $c'$ and $c''$ of Figure 5).



Concluding Remarks and Future Work 

We have investigated the expressiveness of some ‘collective-token’ semantics for 
PT nets. In particular, to remedy the weakness of configuration structures, we 
have introduced concurrent transition systems — a version of higher dimensional 
transition system [3] more suited to the collective token philosophy, as they do
not assign individual identities to multiple action occurrences in a multiset — 
and have shown that they can provide a faithful description of net behaviours. 



Fig. 6.



The diagram of functors, equivalences and natural transformations in
Figure 6 summarises the relationships between all these models. In the diagram,
commutation on the nose (resp. natural equivalence) is represented by $=$ (resp. $\simeq$),
and $\rho$ denotes the unit of the reflection into the subcategory of configuration
structures. The functor CP(_) gives the category of Best-Devillers commutative
processes. The functor ct(_) corresponds to the construction of the CTS for a
given net, as defined in Section 1.3. The functor C(_) yields the construction of
the category of computations (i.e., homotopy equivalence classes of paths
beginning in the initial state) of a CTS. The equivalence $\simeq$ between C(ct(_)) and
$(u_{in} \downarrow \mathcal{T}(\_))$ is shown in Section 2, providing the faithfulness of the construction.
The functor cs(_) represents the abstraction from nets to configuration
structures, defined in Section 1.2. Unfortunately, CSCat is a reflective subcategory
of CTS, as shown in Section 3 via the adjunction $\mathcal{R}(\_) \dashv \mathcal{J}(\_)$. The reflection
functor $\mathcal{R}(\_)$ identifies too many things, so that the natural transformation
associated to the reflection map $\rho$ can identify non-homotopic runs. Our running
example shows that causality information can get lost when using configuration
structures, because non-homotopic paths are mapped into the same equivalence
class.





                                        Structures
Computation Model                       Behavioural                 Algebraic            Logical

Nets and Collective Token Philosophy    Conf. structures, CTS,      T(N)                 CAT ⊗ CMON
                                        Commutative processes

Nets and Individual Token Philosophy    Conc. Pomsets, Event        P(N), Q(N), Z(N)?    CAT ⊗ MON + SYM
                                        Struct., Processes

Table 1.



The conceptual framework of this paper is summarised in Table 1, which
makes explicit our research programme on the behavioural, algebraic and logical 
aspects of the two computational interpretations of PT nets, namely the col- 
lective token and the individual token philosophies, from the viewpoints of the 
structures suited to each of them and their mutual relationships. 

The first row of Table 1 has been treated in this paper. As for the individual
token interpretation, obvious candidates for suitable behavioural structures are
event structures, concatenable pomsets and, especially, various kinds of
concatenable processes. From the logical viewpoint, it is not difficult to formulate
a theory SYM of permutations and symmetries, bridging the gap from
strictly symmetric categories to categories symmetric only up to coherent
isomorphism. On the other hand, the investigation of suitable algebraic models is
still open, as our current best candidates, the symmetric strict monoidal
categories P(N) of concatenable processes and Q(N) of strongly concatenable
processes, are both somehow unsatisfactory: P(_) is a non-functorial
construction, a drawback that inhibits many of the applications we have in mind,
whilst Q(_) solves the problem at the price of complicating the construction and
relying on a non-commutative monoid of objects.

We are currently searching for a better categorical construction, say Z(N),
based on a suitable notion of pre-net that may subsume and underlie the theory
of PT nets and allow us to complete our programme.

Also, the complete analysis and comparison of bisimulation-related issues in
the various models considered in the paper (as has been done for configuration
structures) deserve further work that we leave for a future paper.

Appendix. Recovering the Algebraic Semantics of Nets via
Theory Morphisms

In order to define the theory of strictly symmetric strict monoidal categories, we
first recall the definition of the theory of categories.

The poset of sorts of the PMEqtl-theory of categories is Object < Arrow. 
There are two unary operations d(_) and c(_), for domain and codomain, and 
a binary composition operation _ ; _ defined if and only if the codomain of the 
first argument is equal to the domain of the second argument. Functions with 
explicitly given domain and codomain are always total. 

fth CAT is
  sorts Object Arrow .
  subsort Object < Arrow .
  ops d(_) c(_) : Arrow -> Object .
  op _;_ .
  var a : Object .
  vars f g h : Arrow .
  eq d(a) = a .
  eq c(a) = a .
  ceq a;f = f if d(f) == a .
  ceq f;a = f if c(f) == a .
  cmb f;g : Arrow iff c(f) == d(g) .
  ceq d(f;g) = d(f) if c(f) == d(g) .
  ceq c(f;g) = c(g) if c(f) == d(g) .
  ceq (f;g);h = f;(g;h) if c(f) == d(g) and c(g) == d(h) .
endfth

The extension of the theory CAT to the theory of monoidal categories is 
almost effortless thanks to the tensor product construction of theories, which is 
informally defined as follows. 

Let $T = (\Omega, \Gamma)$ and $T' = (\Omega', \Gamma')$ be theories in partial membership
equational logic, with $\Omega = (S, \le, \Sigma)$ and $\Omega' = (S', \le', \Sigma')$. Their tensor product
$T \otimes T'$ is the theory with signature $\Omega \otimes \Omega'$ having poset of sorts $(S, \le) \times (S', \le')$,
and signature $\Sigma \otimes \Sigma'$, with operators $f_l \in (\Sigma \otimes \Sigma')_n$ and $g_r \in (\Sigma \otimes \Sigma')_m$ for
each $f \in \Sigma_n$ and $g \in \Sigma'_m$ (indices $l$ and $r$ stand respectively for left and right
and witness whether the operator is inherited from the left or from the right
component). The axioms of $T \otimes T'$ are then determined from those of $T$ and $T'$
as explained in [12].

The essential property of the tensor product of theories is expressed in the
following theorem, where $\mathbf{PAlg}_T(C)$ indicates the category of $T$-algebras taken
over the base category $C$ rather than over Set, the category of small sets and
functions.

Theorem 4. Let $T$, $T'$ be theories in partial membership equational logic. Then,
we have the following isomorphisms of categories:

$\mathbf{PAlg}_T(\mathbf{PAlg}_{T'}) \cong \mathbf{PAlg}_{T \otimes T'} \cong \mathbf{PAlg}_{T'}(\mathbf{PAlg}_T)$.

To define the theory of monoidal categories, we introduce a theory CMON of 
commutative monoids and apply the tensor product construction. Here we ex- 
ploit the possibility given by Maude of declaring the associativity, commutativity 
and unit element as attributes of the monoidal operator. 

fth CMON is
  sort Monoid .
  op 0 : -> Monoid .
  op _⊗_ : Monoid Monoid -> Monoid [assoc comm id: 0] .
endfth

The theory of strictly symmetric strict monoidal categories is then defined
as follows. Notice also the use of left and right corresponding to the indices l
and r discussed above.

fth CMONCAT is CMON ⊗ CAT renamed by (
  sort (Monoid, Object) to Object .
  sort (Monoid, Arrow) to Arrow .
  op 0 left to 0 .
  op _⊗_ left to _⊗_ .
  op _;_ right to _;_ .
  op d(_) right to d(_) .
  op c(_) right to c(_) . ) .
endfth

In order to define a theory in PMEqtl that represents PT Petri nets and 
their morphisms, we first introduce a theory whose models are automata whose 
states form a commutative monoid. 

fth CMON-AUT is
  sorts State Transition .
  op 0 : -> State .
  op _⊗_ : State State -> State [assoc comm id: 0] .
  ops origin(_) destination(_) : Transition -> State .
endfth



Proposition 6. The category Petri is a full subcategory of $\mathbf{PAlg}_{\text{CMON-AUT}}$.

Proof. It is immediate to check that each PT net is just a model of CMON-AUT
whose states form the commutative monoid freely generated by the
set of places.

Exploiting the modularity features of Maude, we can characterise Petri as a
subcategory of $\mathbf{PAlg}_{\text{CMON-AUT}}$. We import a functional module MSET[E :: TRIV]
of multisets, parametrised on a functional theory TRIV of elements, whose
models are sets corresponding to the places of the net.

fth TRIV is
  sort Element .
endfth

fmod MSET[E :: TRIV] is
  sort MSet .
  subsort Element < MSet .
  op 0 : -> MSet .
  op _+_ : MSet MSet -> MSet [assoc comm id: 0] .
endfm

fth PETRI[S :: TRIV] is
  protecting MSET[S] renamed by (sort MSet to Marking .) .
  sort Transition .
  ops pre(_) post(_) : Transition -> Marking .
endfth

A theory morphism $H$ from $T$ to $T'$, also called a view in Maude, is a
mapping of the operators and sorts of $T$ into $T'$, preserving domain, codomain and
subsorting, and such that the translations of the axioms of $T$ are entailed by those
of $T'$. It originates a forgetful functor $\mathcal{U}_H : \mathbf{PAlg}_{T'} \to \mathbf{PAlg}_T$ that (for $T$ and
$T'$ theories without freeness constraints, such as those required in PETRI[S])
admits a left adjoint $\mathcal{F}_H : \mathbf{PAlg}_T \to \mathbf{PAlg}_{T'}$ whose effect is to lift $H$ to a free
model construction in $\mathbf{PAlg}_{T'}$. The inclusion functor from Petri to $\mathbf{PAlg}_{\text{CMON-AUT}}$
is induced as the forgetful functor of a theory morphism $I$ specified as a view in
Maude as follows.

view I from CMON-AUT to PETRI[S :: TRIV] is
  sort State to Marking .
  op origin(_) to pre(_) .
  op destination(_) to post(_) .
  op 0 to 0 .
  op _⊗_ to _+_ .
endview

Finally, the algebraic semantics of PT nets under the collective token phi- 
losophy, i.e., the construction T(_), can be easily recovered via a simple theory 
morphism specified in Maude-like notation as 

view V from CMON-AUT to CMONCAT is
  sort State to Object .
  sort Transition to Arrow .
  op origin(_) to d(_) .
  op destination(_) to c(_) .
endview

As stated in Proposition 5, the construction $\mathcal{T}(\_)$: Petri $\to$ CMonCat is
then the following functor composition.

$\mathbf{Petri} \hookrightarrow \mathbf{PAlg}_{\text{CMON-AUT}} \xrightarrow{\mathcal{F}_V} \mathbf{PAlg}_{\text{CMONCAT}}$

Abstract. Chi calculus was proposed as a process algebra that has a
uniform treatment of names. The paper carries out a systematic study
of bisimilarities for chi processes. The notion of L-bisimilarity is
introduced to give a possible classification of bisimilarities on chi processes.
It is shown that the set of L-bisimilarities forms a four element lattice
and that well-known bisimilarities for chi processes fit into the lattice
hierarchy. The four distinct L-bisimilarities give rise to four congruence
relations. A complete axiomatization system is given for each of the four
relations. The bisimulation lattice of asynchronous chi processes and that
of asymmetric chi processes are also investigated. It turns out that the
former consists of two elements while the latter consists of twelve elements. Finally
it is pointed out that the asynchronous asymmetric chi calculus has a
bisimulation lattice of eight elements.



The $\chi$-calculus was introduced with two motivations in mind. One is to
remove the ad hoc nature of the prefix operation in the $\pi$-calculus by having a
uniform treatment of names, thus arriving at a conceptually simpler language.
The second is to materialize a communication-as-cut-elimination viewpoint,
therefore taking up a proof theoretical approach to concurrency theory, an
approach that has proved very fruitful in the functional world. Independently,
Parrow and Victor have come up with essentially the same language, Update
Calculus as they term it. They share, we believe, the first motivation but
have quite a different second one originating from concurrent constraint
programming. The difference between $\pi$ and $\chi$ lies mainly in the way communications
happen. The former adopts the familiar value-passing mechanism whereas the
latter takes an information exchange or information update viewpoint. The
algebraic theory of the language has been investigated in the above mentioned
papers. Parrow and Victor have looked into strong bisimilarity and its axiomatization
for Update Calculus, while Fu has examined an observational bisimilarity
for $\chi$-processes. More recently, Parrow and Victor have proposed Fusion Calculus,
which is a polyadic version of $\chi$-calculus. The authors have also studied
an observational equivalence called weak hyperbisimilarity. What we know
about the language, albeit little, tells us that it can practically do everything $\pi$
can do and its algebraic properties are just as satisfactory. The studies carried
out so far are however preliminary.
The objective of this paper is to continue our examination of the algebraic
theory of the χ-calculus. Section 1 reviews the operational semantics of χ.
Section 2 defines L-bisimilarities and investigates their relationship.
Section 3 gives alternative characterizations of L-bisimilarities. Section 4
presents a complete axiomatization system for each of the congruence
relations induced by the four L-bisimilarities. The next three sections look
into the L-bisimilarities of the asynchronous, the asymmetric, and the
asynchronous asymmetric χ-calculi respectively.



1 Operational Semantics 



In the π-calculus there are two kinds of closed names: dummy names, such
as $x$ in $m(x).P$, and local names, such as $x$ in $(x)\overline{m}x.P$.
In simple words, the χ-calculus is obtained from the π-calculus by unifying
these names. This identification forces a unification of the input and
output prefix operations. The two π-processes just mentioned then turn into
$(x)m[x].P$ and $(x)\overline{m}[x].P$ respectively. In the resulting
calculus communications are completely symmetric, as exemplified by the
following reductions:

$m[x].P \mid \overline{m}[x].Q \rightarrow P \mid Q$

$(x)(R \mid (m[y].P \mid \overline{m}[x].Q)) \rightarrow R[y/x] \mid (P[y/x] \mid Q[y/x])$, where $y \neq x$

$(x)m[x].P \mid (y)\overline{m}[y].Q \rightarrow (z)(P[z/x] \mid Q[z/y])$, where $z$ is fresh



The reader is referred to pi4l5l6lldll4ll7| for more explanations and examples. 

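Let $\mathcal{N}$ be a set of names, ranged over by lower case letters.
$\overline{\mathcal{N}}$, the set of conames, denotes
$\{\overline{x} \mid x \in \mathcal{N}\}$. The following conventions will be
used: $\alpha$ ranges over $\mathcal{N} \cup \overline{\mathcal{N}}$, $\mu$
over $\{\tau\} \cup \{\alpha[x], \alpha x \mid x \in \mathcal{N}\}$, and
$\delta$ over $\{\tau\} \cup \{\alpha[x], \alpha x, [y/x] \mid x, y \in \mathcal{N}\}$.
The set $\mathcal{C}$ of χ-processes is defined by the following BNF:



$P := 0 \mid \alpha[x].P \mid P|P \mid (x)P \mid [x{=}y]P \mid P{+}P$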



The process $\alpha[x].P$ is in prefix form. Here $\alpha$ is the subject
name, and $x$ the object name, of the prefix. The composition operator "|"
is standard. In $(x)P$ the name $x$ is declared local; it cannot be seen
from outside. The set of global names, or nonlocal names, in $P$ is denoted
by $gn(P)$. We will adopt the α-convention, saying that a local name in a
process can be replaced by a fresh name without changing the syntax of the
process. The choice combinator "+" is well known. The process $P+Q$ acts
either as $P$ or as $Q$ exclusively. In this paper we leave out the
replication operator. The results of this paper would not
be affected had it been included.
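As an illustration only, the grammar above can be sketched as a small Python data type, together with a function that computes the global names $gn(P)$. The class and field names (`Nil`, `Prefix`, `co`, and so on) are assumptions made for this sketch and do not come from the paper; the treatment of the names in a match $[x{=}y]P$ as global is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nil:
    """The inactive process 0."""


@dataclass(frozen=True)
class Prefix:
    subject: str   # subject name of the prefix
    co: bool       # True when the subject is a coname (overlined)
    obj: str       # object name
    cont: object   # continuation process


@dataclass(frozen=True)
class Par:
    left: object
    right: object


@dataclass(frozen=True)
class Loc:
    name: str      # the localized name x in (x)P
    body: object


@dataclass(frozen=True)
class Match:
    x: str
    y: str
    body: object


@dataclass(frozen=True)
class Sum:
    left: object
    right: object


def gn(p):
    """Global (nonlocal) names of a process, as a frozenset of strings."""
    if isinstance(p, Nil):
        return frozenset()
    if isinstance(p, Prefix):
        return frozenset({p.subject, p.obj}) | gn(p.cont)
    if isinstance(p, (Par, Sum)):
        return gn(p.left) | gn(p.right)
    if isinstance(p, Loc):
        # A localized name is hidden from the outside.
        return gn(p.body) - {p.name}
    if isinstance(p, Match):
        # Assumption: names tested by a match count as global.
        return frozenset({p.x, p.y}) | gn(p.body)
    raise TypeError(f"not a process: {p!r}")


# (x)(m[x].0 | co-a[y].0): x is local, so only m, a and y are global.
example = Loc("x", Par(Prefix("m", False, "x", Nil()),
                       Prefix("a", True, "y", Nil())))
print(sorted(gn(example)))
```

Note that a single `Prefix` constructor serves both for what would be input and output prefixes in the π-calculus, which mirrors the unification of the two prefix operations described above.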

The operational semantics can be defined either by a reduction semantics
or in terms of a labeled transition system. Here we opt for a pure transition
semantics as it helps to present our results with clear-cut proofs. The labeled
transition system given below defines an early semantics. The reason to use an
early semantics is that the definition of weak bisimulation is more succinct in
early semantics than in late semantics. In the following formulation, symmetric
rules are systematically omitted:



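$$\frac{P \xrightarrow{\mu} P'}{P|Q \xrightarrow{\mu} P'|Q}\ \mathit{Cmp}_0
\qquad
\frac{P \xrightarrow{[y/x]} P'}{P|Q \xrightarrow{[y/x]} P'|Q[y/x]}\ \mathit{Cmp}_1$$

$$\frac{P \xrightarrow{\alpha x} P' \quad Q \xrightarrow{\overline{\alpha}[x]} Q'}{P|Q \xrightarrow{\tau} P'|Q'}\ \mathit{Cmm}_0
\qquad
\frac{P \xrightarrow{\alpha x} P' \quad Q \xrightarrow{\overline{\alpha} x} Q' \quad x \notin gn(P|Q)}{P|Q \xrightarrow{\tau} (x)(P'|Q')}\ \mathit{Cmm}_1$$

$$\frac{P \xrightarrow{\alpha[x]} P' \quad Q \xrightarrow{\overline{\alpha}[x]} Q'}{P|Q \xrightarrow{\tau} P'|Q'}\ \mathit{Cmm}_2
\qquad
\frac{P \xrightarrow{\alpha[x]} P' \quad Q \xrightarrow{\overline{\alpha}[y]} Q' \quad x \neq y}{P|Q \xrightarrow{[y/x]} P'[y/x]|Q'[y/x]}\ \mathit{Cmm}_3$$

$$\frac{P \xrightarrow{\delta} P' \quad x \notin n(\delta)}{(x)P \xrightarrow{\delta} (x)P'}\ \mathit{Loc}_0
\qquad
\frac{P \xrightarrow{\alpha[x]} P' \quad x \notin \{\alpha, \overline{\alpha}\}}{(x)P \xrightarrow{\alpha y} P'[y/x]}\ \mathit{Loc}_1
\qquad
\frac{P \xrightarrow{[y/x]} P'}{(x)P \xrightarrow{\tau} P'}\ \mathit{Loc}_2$$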



Labeled transitions of the form $P \xrightarrow{[y/x]} P'$, called update
transitions, were first introduced to help define communications in a
transition semantics. In applying $\mathit{Loc}_1$, local names need to be
renamed if necessary to prevent $y$ from being captured. In
$\mathit{Loc}_0$, $n(\delta)$ denotes the set of names appearing in
$\delta$. The notation $[y/x]$ occurring in $P[y/x]$, for example, is an
atomic substitution of $y$ for $x$. A general substitution $\sigma$ is the
composition of atomic substitutions, whose effect is defined by

$$P[y_1/x_1] \ldots [y_n/x_n] \stackrel{\mathrm{def}}{=} (P[y_1/x_1] \ldots [y_{n-1}/x_{n-1}])[y_n/x_n].$$

The composition of zero atomic substitutions is the empty substitution
$[\,]$, whose effect is vacuous.
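The left-to-right reading of a composed substitution can be made concrete with a short sketch. It acts on a single name only (applying a substitution to a whole process would also require renaming local names to avoid capture, which is not attempted here). The representation of $[y/x]$ as the pair `(y, x)` and the function names are assumptions of this sketch.

```python
def apply_atomic(name, y, x):
    """Effect of the atomic substitution [y/x] on a single name."""
    return y if name == x else name


def apply_substitution(name, sigma):
    """Apply [y1/x1]...[yn/xn] to a name, one atomic step at a time.

    `sigma` is a list of pairs (y, x), each standing for [y/x].
    The empty list is the empty substitution and changes nothing.
    """
    for y, x in sigma:
        name = apply_atomic(name, y, x)
    return name


# x[y/x][z/y] = (x[y/x])[z/y] = y[z/y] = z
print(apply_substitution("x", [("y", "x"), ("z", "y")]))
# The order matters: x[z/y][y/x] = x[y/x] = y
print(apply_substitution("x", [("z", "y"), ("y", "x")]))
```

The two calls show that composition is not commutative, which is why the defining equation fixes the bracketing.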

The next lemma collects some technical results whose proofs are simple in- 
ductions on derivation. 



Lemma 1. (i) If P P' then Pa P'a. 

(ii) If P P' and xa ya then Pa ^ P' a[ya /xa]. 

(in) If P P' and xa = ya then Pa — ^ P'a. 

(iv) If P P' then P P'[x/y\. 

(v) P P' if and only if P Pi for some fresh z such that P' = Pi[x/z]. 

(vi) Suppose a ^ gn{P). If (a;)(P|a[a;]) ^ P' then (a;)(P|a[x]) ^ > P'. 

(vii) Suppose a ^ gn{P). If (a:)(P|a[x]) — ^ P'\a[y] then P Ph 

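Let $\Rightarrow$ be the reflexive and transitive closure of
$\xrightarrow{\tau}$. We will write $\stackrel{\mu}{\Rightarrow}$
($\stackrel{\delta}{\Rightarrow}$) for
$\Rightarrow \xrightarrow{\mu} \Rightarrow$
($\Rightarrow \xrightarrow{\delta} \Rightarrow$). We will also write
$\stackrel{\hat{\mu}}{\Rightarrow}$ ($\stackrel{\hat{\delta}}{\Rightarrow}$)
for $\stackrel{\mu}{\Rightarrow}$ ($\stackrel{\delta}{\Rightarrow}$) if
$\mu \neq \tau$ ($\delta \neq \tau$) and for $\Rightarrow$ otherwise. A
sequence of names $x_1, \ldots, x_n$ will be abbreviated to $\tilde{x}$;
consequently $(x_1) \ldots (x_n)P$ will be abbreviated to $(\tilde{x})P$.
When the length of $\tilde{x}$ is zero, $(\tilde{x})P$ is just $P$.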
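For a finite transition graph, the closure and the derived weak transitions can be computed directly. The sketch below is only an illustration under assumed conventions: states and labels are plain strings, the label `"tau"` stands for the silent action, and the graph is a dictionary from a state to its list of `(label, target)` pairs. None of these names come from the paper.

```python
def tau_closure(trans, start):
    """States reachable from `start` by zero or more tau steps."""
    seen = {start}
    stack = [start]
    while stack:
        state = stack.pop()
        for label, target in trans.get(state, ()):
            if label == "tau" and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def weak_step(trans, start, label):
    """States reachable by the 'hatted' weak transition for `label`.

    For a visible label this is: tau steps, one `label` step, tau steps.
    For "tau" it is just the reflexive and transitive closure.
    """
    if label == "tau":
        return tau_closure(trans, start)
    result = set()
    for state in tau_closure(trans, start):
        for lab, target in trans.get(state, ()):
            if lab == label:
                result |= tau_closure(trans, target)
    return result


# A hypothetical four-state graph: P -tau-> P1 -a-> P2 -tau-> P3
graph = {
    "P": [("tau", "P1")],
    "P1": [("a", "P2")],
    "P2": [("tau", "P3")],
}
print(sorted(weak_step(graph, "P", "a")))    # states after a weak a-step
print(sorted(weak_step(graph, "P", "tau")))  # includes P itself
```

The case distinction on `"tau"` is what makes the hatted arrow reflexive for silent moves, which is the property the bisimulation definitions rely on.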
2 Bisimulation Lattice



We introduce in this section L-bisimilarities, which are refinements of the
local bisimilarity. The reason to study L-bisimilarities is that they
provide a framework to understand bisimilarity relations of interest.

In a symmetric calculus such as χ, it does not make much sense to say that an
action with positive, respectively negative, subject name is an input,
respectively output, action. An action is an input or an output depending on
whether the object name is being received or being sent out. Let $o$ denote
the set $\{a[x] \mid a,x \in \mathcal{N}\}$ of output actions,
$\overline{o}$ the set $\{\overline{a}[x] \mid a,x \in \mathcal{N}\}$ of
co-output actions, $i$ the set $\{ax \mid a,x \in \mathcal{N}\}$ of input
actions, $\overline{i}$ the set $\{\overline{a}x \mid a,x \in \mathcal{N}\}$
of co-input actions and $u$ the set $\{[y/x] \mid x,y \in \mathcal{N}\}$ of
updates. Let $\mathcal{L}$ stand for
$\{\bigcup S \mid S \subseteq \{o, \overline{o}, i, \overline{i}, u\} \wedge S \neq \emptyset\}$.
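Since the five action classes are non-empty and pairwise disjoint, distinct non-empty selections of classes give distinct unions, so the set of observation capabilities has $2^5 - 1 = 31$ elements. The following sketch simply enumerates them; the string tags used for the five classes are placeholders chosen for this illustration.

```python
from itertools import combinations

# Hypothetical tags for the classes o, co-o, i, co-i and u.
ACTION_CLASSES = ("o", "co-o", "i", "co-i", "u")


def observation_sets(classes):
    """Return every non-empty subset of `classes` as a frozenset."""
    return [
        frozenset(chosen)
        for size in range(1, len(classes) + 1)
        for chosen in combinations(classes, size)
    ]


all_L = observation_sets(ACTION_CLASSES)
print(len(all_L))  # 31
```

Each element of `all_L` plays the role of one choice of $S$; the observer's capability $L$ is the union of the action classes named in it.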

Definition 1. Let $\mathcal{R}$ be a binary symmetric relation on
$\mathcal{C}$ and let $L$ be an element of $\mathcal{L}$. The relation
$\mathcal{R}$ is an L-bisimulation if whenever $P \mathcal{R} Q$ then for any
process $R$ and any sequence $\tilde{x}$ of names it holds that if
$(\tilde{x})(P|R) \xrightarrow{\phi} P'$ for $\phi \in L \cup \{\tau\}$ then
there exists some $Q'$ such that
$(\tilde{x})(Q|R) \stackrel{\hat{\phi}}{\Rightarrow} Q'$ and
$P' \mathcal{R} Q'$. The L-bisimilarity $\approx_L$ is the largest
L-bisimulation.

This is a uniform definition of 31 L-bisimilarities. The intuition is that
$\approx_L$ is what an observer recognizes if he or she is capable of
observing actions in $L$ and only in $L$. We will show that the
L-bisimilarities collapse to four distinct relations. In the rest of this
section let $L$ be an arbitrarily fixed element of $\mathcal{L}$.
First we establish a few technical lemmas. The next one follows directly from
the definition.



Lemma 2. If $P \Rightarrow P_1 \approx_L Q$ and $Q \Rightarrow Q_1 \approx_L P$ then $P \approx_L Q$.

For (j) G L, let {(j)) be a process such that (i) {(f) 0 and (ii) if ((/>) A then 

A = 0. 

Lemma 3. Suppose $a \notin gn(P|Q)$. Then (i) $(x)(P|a[x]) \approx_L (x)(Q|a[x])$
implies $P \approx_L Q$; and (ii) $P|a[x] \approx_L Q|a[x]$ implies
$P \approx_L Q$.

Proof, (i) Suppose 4> G L and n(^) n gn{P\Q) — 0. As (x)(P|a[a;])|a[x].((/>) 
(P|0)|0, Qi exists such that (a;)((5|a[a;])|a[a;].((^) (Qi|0)|0«l (P|0)|0, which 

implies (a;)(( 5 |a[ 2 ;]) Qi|0 P|0, which in turn implies Q Qi. Similarly 
Pi exists such that P Pi ~l Q- By Lemma 0 P Q. (ii) can be proved 
similarly. □ 



Lemma 4. If $P \approx_L Q$ then $P\sigma \approx_L Q\sigma$ for an arbitrary substitution $\sigma$.

Proof. Suppose P Q- We only have to show that for x G gn{P\Q) and y ^ x 
one has that P[y/x\ Q[y/x\. Let 5 be a distinct fresh name. Suppose 4> G L 
and n{4>) n gn{P\Q) = 0. By definition the actions {x){P\{b[y]\b[x].{(j)))) 
P[?//a;]|( 0 | 0 ) must be matched up by 

{x){Qmy]\b[x].m)^Qimo). (i) 

If {x){Q\{b[y]\b[x].{(j)))) {x'){Q 2 \{b[y]\b[x' ].{(!>))) then by symmetry and a- 

convention the reduction is the same as 

{xmmmm) ^ {xm.myMxum 

such that Q — ^ Q 3 . It follows that m can be factorized as follows 

{x){Q\{b[y]\b[x].{m {x){Q'my]\b[x].{m 

^Q'[y/x]\{Om) 

^Qi|( 0 | 0 ) 

for some Q' and Qi such that Q => Q' and Q'[y/x] Qi P[y/x]- By 
Lemma □ Q[y/ a;] Q'[y/x]. Similarly Pi exists such that P[y/x] Pi 

Q[y/x], By LemmaEl P[y/x] Q[y/x]. □ 

By definition the L-bisimilarity is closed under localization and composition. 
Using Lemma0it can be easily seen that is closed under prefix operation. 

Theorem 1. If $P \approx_L Q$ and $O \in \mathcal{C}$ then
(i) $\alpha[x].P \approx_L \alpha[x].Q$; (ii) $P|O \approx_L Q|O$;
(iii) $(x)P \approx_L (x)Q$; and (iv) $[x{=}y]P \approx_L [x{=}y]Q$.

We investigate next the order structure of L-bisimilarities. 

Theorem 2. The following properties hold of the L-bisimilarities: 

(i) (ii) ^ L^'^u- (Hi) 

Proof, (i) It is obvious that (a;)a[a;].(5)(6[a:]|5[z]) 9^0 a[z]-\-{x)a[x].{b){b[x]\b[z]). 
It takes a while to see that (x)a[a;].(5)(6[a:]|5[z]) a[z]-\-{x)a[x\.{b)(b[x\\b[z\). 

(ii) To prove one only has to show that if P Q and P P' 

then Q' exists such that Q Q' and P' Q' ■ Now P p' implies that 
(a;)(P|a[a;]) — ^ P'\a[y] for a fresh a. So {x){Q\a[x]) Q'\a[y] for some Q' 

such that P'\a[y] Q'\a[y\. It follows from Lemma 0 that P' Q' . Clearly 

{x){Q\a[x\) Q'\a[y] can be factorized as 

(a;)(Q|a[a;]) ^ (a;)(Qi |a[a;]) ^ Q 2 \a[y\ Q'\a[y], 

where Q Q\. By (vii) of Lemma □ (a;)(Qi| a [a;]) Q 2 \a[y] implies Qi 

Q 2 - Hence Q Q' . 

(iii) Assume P Q and P P' . Suppose 4> G L and n{4>) C gn{P\Q) = 0. 

Now P\(a[z\-\-{4>)) — ^ Pi|0 for some fresh z such that P' = Pi[a;/z]. There has 
to be some Q\ such that Q\(a[z\-\-{4>)) Qi |0 Pi|0. So Q Q\ KiL Pi- 
Therefore Q Q\\xjz\ Pi[a:/z] = P' by Lemma [D and Lemma El Hence 
Fig. 1. The Bisimulation Lattice of Chi Processes 



In the proof of (iii) of the above theorem, we need to use a fresh name z 
because we cannot conclude R R' from ^ R'\0. It may well be that 

R participates in the communication by performing R — i R' . 

Let $\mathcal{L}(\chi)$ be $\{\approx_L \mid L \in \mathcal{L}\}$, the set of
all L-bisimilarities. $\mathcal{L}(\chi)$ is a partial order when equipped
with $\subseteq$. For $\approx_{L_1}, \approx_{L_2} \in \mathcal{L}(\chi)$,
$\approx_{L_1 \cup L_2}$ is the infimum. Theorem 2 says that
$\mathcal{L}(\chi)$ is a four element lattice. The diagram in Fig. 1 is a
pictorial representation of the lattice. In the diagram each node is the
principal representative of a number of L-bisimilarities that boil down to
the same relation. An
arrow indicates a strict inclusion. The bottom element is represented by «oUo 
while the top element is by We will call (£(x)j C) the bisimulation 

lattice of χ-processes. The lattice structure suggests that the ability to
observe output actions is stronger than the ability to observe input actions.
One way to understand this is that the effect of an output action is unknown
whereas that of an input action has already been delimited.



3 Alternative Characterization 

The definition of L-bisimilarity is natural but intractable. It contains universal 
quantifications over both processes and names. In this section alternative char- 
acterizations of the four distinct L-bisimilarities are presented. For each of the 
relations, an open style bisimilarity is shown to coincide with it. In the alterna- 
tive definitions, one still has a universal quantification over substitutions. But it 
is clear that one only has to consider a finite number of them at each step. First 
we will see how barbed bisimilarity fits into the lattice hierarchy. 



3.1 Barbed Bisimilarity 

Barbed bisimilarity seems to be the weakest bisimulation equivalence proposed
so far. It applies to a whole range of process calculi and therefore acts as
a convenient tool to study relationships between different calculi. The
relationship of barbed bisimilarity to other bisimilarities is itself an
interesting question. Usually it is easy to show that a given bisimilarity is
included in the barbed one. The difficulty is in deciding whether the
inclusion is strict. This section answers the question for χ-processes.
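Definition 2. A process $P$ is strongly barbed at $a$, notation
$P \downarrow a$, if $P \xrightarrow{\alpha[x]} P'$ or
$P \xrightarrow{\alpha x} P'$ for some $P'$ such that
$\alpha \in \{a, \overline{a}\}$. $P$ is barbed at $a$, notation
$P \Downarrow a$, if some $P'$ exists such that
$P \Rightarrow P' \downarrow a$. A binary relation $\mathcal{R}$ is barbed if
$\forall a \in \mathcal{N}.\ P \Downarrow a \Leftrightarrow Q \Downarrow a$
whenever $P \mathcal{R} Q$.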



Definition 3. Let $\mathcal{R}$ be a barbed symmetric relation on
$\mathcal{C}$. It is called a barbed bisimulation if whenever
$P \mathcal{R} Q$ then for any $R$ and any sequence $\tilde{x}$ of names it
holds that if $(\tilde{x})(P|R) \xrightarrow{\tau} P'$ then $Q'$ exists such
that $(\tilde{x})(Q|R) \Rightarrow Q'$ and $P' \mathcal{R} Q'$. The barbed
bisimilarity $\approx_b$ is the largest barbed bisimulation.

The next result locates in the bisimulation lattice. 

Theorem 3. is the same as 

Proof. As is clearly barbed, To prove the reverse inclusion, first 

notice that is closed under substitution, the proof of which is similar to 
that of Lemma 0 Now suppose P ~b Q and P P' . By (v) of Lemma E 

P for some fresh z such that P' = P\[x/z\. Let a be fresh. Now 

P|(a[z].a[a]|a[a]) ^ ^ > Pi|(0|0). This sequence of reductions must be matched 

up by QI(«[-z]-a[o]|a[a]) Qi|(0|0) Pi|(0|0). There are only two ways for 

Q to evolve into Q\. either Q Qi or Q Qi. The former is impossi- 
ble because z is fresh. Hence Q Qi Pi. It follows from Lemmas that 
Q Qi[a;/z] Pi[a;/z] = P' . Conclude that □ 

3.2 Open Bisimilarity 

Open bisimilarity was proposed by Sangiorgi for π-processes as a "correction
of late bisimulation". This section defines open bisimilarity for χ-processes
and relates it to one of the L-bisimilarities.

Definition 4. Let $\mathcal{R}$ be a binary symmetric relation on
$\mathcal{C}$. It is called an open bisimulation if whenever
$P \mathcal{R} Q$ then for any substitution $\sigma$ it holds that if
$P\sigma \xrightarrow{\delta} P'$ then $Q'$ exists such that
$Q\sigma \stackrel{\hat{\delta}}{\Rightarrow} Q'$ and $P' \mathcal{R} Q'$.
The open bisimilarity $\approx_{open}$ is the largest open bisimulation.

It is clear from the definition that $\approx_{open}$ is closed under
substitution.

Lemma 5. $\approx_{open}$ is closed under localization and composition.

This easy lemma can be proved by constructing appropriate bisimulations.

Theorem 4. $\approx_{open}$ coincides with $\approx_{o \cup \overline{o}}$.

Proof. By Lemma 0 and the proof of Theorem E| (o U o)-bisimilarity is an open 
bisimulation. By Lemma 0 open bisimilarity is an (o U o)-bisimulation. □ 






Yuxi Fu 



The definition of open bisimilarity makes it easy to give an axiomatization
system for finite χ-processes. Theorem 4 enables us to axiomatize the bottom
element of the bisimulation lattice $\mathcal{L}(\chi)$. In order to do the
same for the other three elements of the bisimulation lattice, one would like
to characterize these three L-bisimilarities in an "open" style, so to speak.
This is precisely what we are going to do next. The intuition for the
following definition comes from the example given in the proof of (i) of
Theorem 2.

Definition 5. Let $\mathcal{R}$ be a binary symmetric relation on
$\mathcal{C}$. It is called an open $i$-bisimulation (open $o$-bisimulation,
open $\overline{o}$-bisimulation) if whenever $P \mathcal{R} Q$ then
for any substitution $\sigma$ it holds that

(i) if Per P' for (j) G iL>iLluLl{T} (cj) G oUiUiUMU{r}, (j) G bUiUiUuU{r}^ 
then Q' exists sueh that Qer Q' and P'TZQ' ; and 

(ii) if Pa — i P' (Pa — P' , Pa — P') then some Q' exists such that 

P'TZQ' and either Qa Q' (Qa =y- Q', Qa Q') or Qa Q' 

(Qa Qa Qi J jgj. gQjjif. fj-ggfi z. 

The open i -bisimilarity (open o-bisimilarity, open o -bisimilarity), denoted by 
^open ^^6 largest Open i -bisimulation (open o-bisimulation, 

open o-bisimulation) . 

The next theorem explains the reason to have Definition 5.
Theorem 5. (i) = (H) ~open = ~o; (in) «open=~*- 

Proof. It is enough to see how to establish (iii). Suppose P Q and P P' . 

[x/z] 

For a fresh name z one has T’|o;[-2] P'\0. As Q' exists such that 

Q|a[z] Q'\0 Rii P'|0. There are only two possibilities: either Q Q' P' 

or Q Q' p' Hence by Lemma 0 

To prove the reverse inclusion we use the fact that is by definition closed 

under substitution. We only have to show that the composition and localization 
operators preserve the relation «ope„. To prove that {((u)(P|i?), (u)(Q|ii)) | 
P Q} is an open Tbisimulation, it is sufficient to examine the cases where 

P performs an output action. We consider only one case: 

— Suppose that (v){P\R) (u){P'[z/y]\R[z/y]), where y G {u}, is derived 

from P P' by applying rule Loci. Suppose P P' is matched up 
by Q Qi Q' for some Qi, Q' and fresh w. By Lemma 0 Q 
Qi[z/w] Q'[zlw][y/ z] = Q'[y/z\ as w ^ gn(Q'). By (iv) of Lemma0 
Q Qi[z/w] Q'[y/z\[z/y] = Q'[z/y]. It follows that 



{v){Q\R) ^ {v){Q,[z/w]\R) {u){Q'[z/y]\R[z/y]) 

matches up (t))(P|i?) {u){P' [z / y\\R[z / y\) . 

The closure property enables us to conclude that ~open is an i-bisimulation. □ 
The open style bisimilarities are particularly suitable for χ-like process
calculi in which names are uniform.

4 Axiomatization 

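The L-bisimilarities are not congruence relations. To get a congruence
relation from $\approx_L$ we use the standard approach.

Definition 6. $P =_L Q$ if and only if $P \approx_L Q$ and whenever
$P \xrightarrow{\tau} P'$ ($Q \xrightarrow{\tau} Q'$) then $Q'$ ($P'$) exists
such that $Q \stackrel{\tau}{\Rightarrow} Q'$
($P \stackrel{\tau}{\Rightarrow} P'$) and $P' \approx_L Q'$.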

We will call $=_L$ the L-congruence on χ-processes. According to the
bisimulation lattice, there are only four distinct L-congruence relations.
The aim of this section is to give complete axiomatization systems for all
four of them. In this section we omit most of the proofs since they are very
similar to corresponding proofs in the literature. In the axiomatization we
need a prefix operation that generalizes the standard τ-operation. It is
defined as follows:

$[y|x].P \stackrel{\mathrm{def}}{=} (a)(\overline{a}[y] \mid a[x].P)$ where $a$ is fresh.

The next result is known as the Hennessy Lemma. It is very useful in the
study of axiomatization systems.

Lemma 6. $P \approx_L Q$ if and only if either $[x|x].P =_L Q$ or
$P =_L Q$ or $P =_L [x|x].Q$.

For the purpose of axiomatization, let $\pi$ and $\gamma$ range over
$\{\alpha[x], [y|x] \mid x, y \in \mathcal{N}\}$. In the rest of this section
$M$ and $N$ denote finite lists of match equalities $x{=}y$. Suppose $M$ is
$x_1{=}y_1, \ldots, x_n{=}y_n$. Then $[M]P$ denotes
$[x_1{=}y_1] \ldots [x_n{=}y_n]P$. If $M$ logically implies $N$, we write
$M \Rightarrow N$; and if both $M \Rightarrow N$ and $N \Rightarrow M$ we
write $M \Leftrightarrow N$. If $M$ is an empty list, it plays the role of
logical truth, in which case $[M]P$ is just $P$. Clearly a list $M$ of match
equalities defines an equivalence relation on the set $n(M)$ of names
appearing in $M$. We use $\sigma_M$ to denote an arbitrary substitution that
replaces all members of an equivalence class by a
representative of that class.
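One way to obtain such a substitution from a list of match equalities is a small union-find pass, sketched below. The text allows any representative; picking the lexicographically smallest name in each class is an assumption made here only to get a deterministic result, and the function name `sigma_m` is likewise illustrative.

```python
def sigma_m(matches):
    """Map each name occurring in `matches` to a class representative.

    `matches` is a list of pairs (x, y), each standing for x=y.
    Assumption: the representative is the smallest name of the class.
    """
    parent = {}

    def find(name):
        parent.setdefault(name, name)
        while parent[name] != name:
            # Path halving keeps the trees shallow.
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for x, y in matches:
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            small, large = sorted((root_x, root_y))
            # Attach the larger root under the smaller one, so the root
            # of every class is always its smallest member.
            parent[large] = small

    return {name: find(name) for name in list(parent)}


# x=y, y=z and b=a give the classes {x, y, z} and {a, b}.
print(sigma_m([("x", "y"), ("y", "z"), ("b", "a")]))
```

Two lists of match equalities that are logically equivalent induce the same classes and hence, under this choice of representative, the same mapping.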

Sangiorgi has presented a complete system for strong open congruence on
π-processes. Parrow and Victor consider in their paper two axiomatization
systems for strong hyperbisimilarity, one with the match operator and one
without it. In Fig. 2 a conditional equational system AS is defined together
with some derived rules and their justifications. The expansion law is the
following equation:



Y.j Wj] (yhj ■{P\Qj)+ ’] [Nj]{x) (y) [ai=bj] [Xi\yj] . (Pi\Qj) 



where P is ^-[Mi](a;)7ri.Pi and Q is J2jWj]{y)lj-Qj- Rules concerning equiva- 
lence and congruence relations have been omitted. This is Victor and Parrow’s 
L1    $(x)0 = 0$
L2    $(x)\alpha[y].P = 0$                          $x \in \{\alpha, \overline{\alpha}\}$
L3    $(x)\alpha[y].P = \alpha[y].(x)P$             $x \notin \{y, \alpha, \overline{\alpha}\}$
L4    $(x)(y)P = (y)(x)P$
L5    $(x)[y{=}z]P = [y{=}z](x)P$                   $x \notin \{y, z\}$
L6    $(x)(P+Q) = (x)P + (x)Q$
L7    $(x)[x{=}y]P = 0$
L8    $(x)[y|x].P = [y|y].P[y/x]$                   $x \neq y$
L9    $(x)[y|z].P = [y|z].(x)P$                     $x \notin \{y, z\}$
M1    $[M]P = [N]P$                                 if $M \Leftrightarrow N$
M2    $[x{=}y]P = [x{=}y]P[y/x]$
M3    $[x{=}y](P+Q) = [x{=}y]P + [x{=}y]Q$
S1    $P + 0 = P$
S2    $P + Q = Q + P$
S3    $P + (Q + R) = (P + Q) + R$
S4    $[x{=}y]P + P = P$
U1    $[y|x].P = [x|y].P$
U2    $[x|x].P = [y|y].P$
U3    $[y|x].P = [y|x].[x{=}y]P$
      Expansion Law

LD1   $(x)[x|x].P = [y|y].(x)P$                     by U2 and L9
MD1   $[x{=}y]0 = 0$                                by S1 and S4
MD2   $[x{=}x]P = P$                                by M1
MD3   $[M]P = [M](P\sigma_M)$                       by M2
SD1   $P + P = P$                                   by MD2 and S4
SD2   $[M]P + P = P$                                by S-rules
UD1   $[y|x].P = [y|x].P[y/x]$                      by U3 and M2

Fig. 2. Axiom System AS and its Derived Rules

system without mismatch. The only difference is that we use a symmetric update 
prefix operator. 

We will write $AS \cup \{R_1, \ldots, R_n\} \vdash P = Q$ to mean that the
equality $P = Q$ is derivable from the rules and axioms of AS together with
the rules $R_1, \ldots, R_n$. When no confusion arises, we simply write
$P = Q$. We will also write $P \stackrel{R}{=} Q$ to indicate that $R$ is the
major axiom applied to derive $P = Q$.

Definition 7. A process $P$ is in normal form if
$P = \sum_{i \in I_1} [M_i]\alpha_i[x_i].P_i + \sum_{i \in I_2} [M_i](x)\alpha_i[x].P_i + \sum_{i \in I_3} [M_i][z_i|y_i].P_i$
such that $x$ does not appear in $P$ and $P_i$ is in normal form for each
$i \in I_1 \cup I_2 \cup I_3$. Here $I_1$, $I_2$ and $I_3$ are pairwise
disjoint finite indexing sets.

Notice that if $P$ is in normal form and $\sigma$ is a substitution then
$P\sigma$ is in normal form. The depth of a process measures the maximal
length of nested prefixes in the process. The structural definition goes as
follows: (i) $d(0) = 0$; (ii) $d(\alpha[x].P) = 1 + d(P)$;
(iii) $d(P|Q) = d(P) + d(Q)$; (iv) $d((x)P) = d(P)$;
(v) $d([x{=}y]P) = d(P)$; (vi) $d(P+Q) = \max\{d(P), d(Q)\}$;
(vii) $d([x|y].P) = 1 + d(P)$.
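The seven clauses of the depth function translate directly into a recursive procedure. The sketch below encodes processes as nested tuples whose first component is a tag; this encoding and the tag names are assumptions made purely for illustration.

```python
def depth(p):
    """Depth of a process given as a tagged tuple.

    Assumed encodings:
      ("nil",)                 0
      ("prefix", a, x, P)      a[x].P
      ("update", x, y, P)      [x|y].P
      ("par", P, Q)            P|Q
      ("loc", x, P)            (x)P
      ("match", x, y, P)       [x=y]P
      ("sum", P, Q)            P+Q
    """
    tag = p[0]
    if tag == "nil":
        return 0
    if tag in ("prefix", "update"):
        return 1 + depth(p[-1])              # clauses (ii) and (vii)
    if tag == "par":
        return depth(p[1]) + depth(p[2])     # clause (iii)
    if tag in ("loc", "match"):
        return depth(p[-1])                  # clauses (iv) and (v)
    if tag == "sum":
        return max(depth(p[1]), depth(p[2]))  # clause (vi)
    raise ValueError(f"unknown process tag: {tag!r}")


NIL = ("nil",)
# a[x].0 | (b[y].0 + c[z].[x|y].0) has depth 1 + max(1, 2) = 3
example = ("par",
           ("prefix", "a", "x", NIL),
           ("sum",
            ("prefix", "b", "y", NIL),
            ("prefix", "c", "z", ("update", "x", "y", NIL))))
print(depth(example))  # 3
```

Because composition adds depths while choice takes the maximum, the depth bounds the number of prefixes that can fire along any single run, which is what makes it usable as an induction measure.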
T1    $\alpha[x].[y|y].P = \alpha[x].P$
T2    $P + [y|y].P = [y|y].P$
T3    $\alpha[x].(P + [y|y].Q) = \alpha[x].(P + [y|y].Q) + \alpha[x].Q$
TD1   $[x|z].[y|y].P = [x|z].P$
TD2   $[x|z].(P + [y|y].Q) = [x|z].(P + [y|y].Q) + [x|z].Q$
O1    $(z)\overline{m}[z].(P + [x|z].Q) = (z)\overline{m}[z].(P + [x|z].Q) + \overline{m}[x].Q$    $x \neq z$
O2    $(z)m[z].(P + [x|z].Q) = (z)m[z].(P + [x|z].Q) + m[x].Q$    $x \neq z$
O3    $(z)\alpha[z].(P + [x|z].Q) = (z)\alpha[z].(P + [x|z].Q) + \alpha[x].Q$    $x \neq z$

Fig. 3. The Tau Laws and the Output Laws



Lemma 7. In AS each process P is provably equal to some P' in normal form 
such that d{P') < d{P). 

We need the χ-version of the well-known tau laws as given in Fig. 3. TD1 and
TD2 are derivable from T1 and T3 respectively.

The tau laws are enough to characterize $=_{o \cup \overline{o}}$. To get
complete systems for the other three L-congruence relations, we need what we
call the output laws O1, O2 and O3 given in Fig. 3.

The following lemma is crucial to the proof of the completeness theorem. 
Lemma 8. Suppose Q is in normal form. Then 

(i) if QaM Q' then AS U {Tl, T2, T3} Q = Q + [M] [a;|a;].Q' for each x; 
(a) if Qcfm =y- Q' then AS U {Tl, T2, T3} \- Q = Q + [M]a[x].Q' ; 

(Hi) if z ^ gn{Q) U n{M) and Qum Q' then AS U {T1,T2,T3} h Q = 
Q + [M]{z)a[z].Q' ; 

(iv) if QaM Q' then AS U {Tl, T2, T3} \- Q = Q + 

(v) if z ^ gn{Q) U n{M) and QaM 211^ Q' then AS U {Tl, T2, r3}U {Ol} h 
Q = Q + [M]m[x].Q'; 

(vi) ifz^ gn{Q) U n(M) and QaM Q' then AS U (Tl, T2, T3| U {02} h 

Q = Q+ [AI]rri[x\.Q' ; 

(vii) if z ^ gn{Q)Un{M) and QaM Q' then ^S'U{T1,T2,T3}U{03} h 

Q = Q+ [AI]a[x].Q' . 

Proof, (i) through (iv) are proved by inductions on derivation, (v) through (vii) 
are proved using (iii) and (iv). □ 



Theorem 6. We have the following completeness results:

(i) $AS \cup \{T1, T2, T3\}$ is sound and complete for $=_{o \cup \overline{o}}$;

(ii) $AS \cup \{T1, T2, T3\} \cup \{O1\}$ is sound and complete for $=_o$;

(iii) $AS \cup \{T1, T2, T3\} \cup \{O2\}$ is sound and complete for $=_{\overline{o}}$;

(iv) $AS \cup \{T1, T2, T3\} \cup \{O3\}$ is sound and complete for $=_i$.

Proof. The soundness part is clear. 
(i) Suppose P and Q are in normal form and P =ous Q- We prove the com- 
pleteness by induction on the sum of the depths of P and Q. Let [Mi]{x)ai[x].Pi 

be a summand of P. Now {[Mi]{x)ai[x].Pi)aMi — ^ PiO'Mi must be matched 
up by QaMi Q such that PiaMi ~ouo Q ■ By Lemma |6l we know that 

either [y\y], PiaMi =ouo Q' or PiOUi =ouo Q' or P^aui =oUo [x\x].Q' . Sup- 
pose PiaMi =oUo [y\y]-Q' is the case. Both PiaMi and [y\y]-Q' are in normal 
form and d{PiaMi) + d{[y\y\.Q') < d{P) -I- d{Q). So by induction hypothesis 
Pi<^Mi = [y\y\-Q' ■ It follows that 

[Mi]{x)ai[x].Pi [Mi]{x)aiaMi[x].PiaM, 

^=' [Mi]{x)aiaMi[x].[y\y].Q' 

= [Mi]{x)aiaMi[x].Q' ■ 

Hence [Mi]{x)ai[x].Pi + Q = [Mi]{x)aiaMi[x]-Q' + Q = Q hy (iii) of Lemma 0 
The situation is the same when P' =ouo Q' or P' =ouo [x\x].Q' . It can be 
similarly proved that [Mi]ai[xi].Pi + Q = Q, respectively [Mi][xi\yi].Pi + Q = Q, 
whenever [Mi]ai[xi].Pi, respectively [Mi][xi\yi].Pi, is a summand of P. Conclude 
that P + Q = Q. Symmetrically P + Q = P. Hence P = Q. 

(iv) Let P and Q be in normal form and P —i Q. Suppose 



biO-jM /z] 






is matched up by QaMi =^’ Qi ‘ ' Q' such that PiaMi ~i Q' ■ By 

Theorem0this is the only situation not covered by the proof of (i). By LemmaEl 
either [x\x\. PiaMi =i Q' or PiaMi =i Q' or PiaMi =i [x\x].Q' . Now suppose 
[x\x\. PiaMi =i Q' is the case. Both [x\x\. PiaMi and Q' are in normal form. So 
by induction hypothesis [x\x\. PiaMi = Q' ■ Therefore 



[Mi]ai[Xi].Pi = [Mi]aiaMi[xiaMi]-PiCrMi 

= [Mi]aiaMi[xiaMi]-Q' ■ 

Thus [Mi]ai[xi].Pi + Q = [Mi]aia Mi[xia Mi]-Q' + Q = Q by (vii) of Lemma0 
The rest of the proof can be safely omitted. □ 



5 Asynchronous Chi Calculus 

In the world of the π-calculus, attention has been paid to an asynchronous
version of the language. It has been argued that it has the same expressive
power as the synchronous one. There is also a case for an asynchronous
χ-calculus. Merro has recently given a fully abstract translation from the
asynchronous χ to the asynchronous π. One hopes to establish that χ and π
have the same expressive power by relating χ to its asynchronous cousin. As
one would expect, the asynchronous χ is defined as follows:



$P := 0 \mid a[x].P \mid \overline{a}[x] \mid P|P \mid (x)P \mid [x{=}y]P \mid P{+}P$
The operational semantics remains unchanged. So does the definition of L- 
bisimilarity. Lemma 0 and Theorem Q] still hold. But the bisimulation lattice 
shrinks. 



Theorem 7. o<Z~o=~u=~i=~j- The inclusion is strict. 



Proof. The proofs of the equivalence and the strictness of the in- 
clusion are as before. It remains to show that for all L-bisimilarities 

Now suppose P Q and P P' . Let R be a[a;]-|-(^) such that (j) € L 
and n{(j>) n gn{P\Q) = 0. Then P\R — ^ P'|0. It follows that Q' exists such 

that Q\R — ^ Q|0 and P' Q'- There are two cases: Either Q Q' or 

Q Qi Q' for some fresh z. In the latter case, notice that no action 
is causally dependent on the action az. So all the rest of the actions can be 
performed before the action at that particular a happens. But then the update 



becomes a communication in which a local name is substantiated by x. 
It follows that the delayed action at that particular a must be a[x\. That is 



Q Q 2 Q' can be rearranged as Q 
matchable by Q Q' . 



Q' . So in either case P 



Consequently the bisimulation lattice of the asynchronous χ contains only
two elements.



6 Asymmetric Chi Calculus 



The simplicity of the lattice $\mathcal{L}(\chi)$ is due essentially to the
symmetry of the language. If we insist that there is a difference between an
action with a positive subject name and an action with a negative subject
name, then an asymmetric version of the χ-calculus results. This is the
Update Calculus of Parrow and Victor.

The operational semantics of the asymmetric χ-calculus is defined in the same
way as that of the symmetric χ. The rules Sqn, Cnd and Sum remain unchanged.
The other rules are given below:



P 



P' 



if p, = m(x) then x ^ gn{Q) 
P\qJUp'\Q 



■Cmpo 



p p! 

P\Q P'\Q[y/x_ 



'Cmp^ 



P‘'^^P' y^gnjQ) 
P\Q^£lZ^ P'\Q[y/x] 



p^p, p^p, q"±}q' x^gn{P)^ 

P\Q^P'\Q' P\Q ^ {x){P'\Q') 



P 



p/ Q ™[^J 

P\Q^P'\Q' 



Q' 

— Cmm2 



P\q'^' P' lvMmy/x] 
P\Q 


P'[ylx\\Q'[y/x] 


Cmm4 


P-^P' x^n{6) 






P^P' x^ 


{x)P {x)P' 


loco 


(^)p LLf P'[y/^] Loci 


{^)P^P' 


P p> 




P pf 


p ( 1^1 p/ 



{x)P ^ P' {y)P P' {x)P ^ (y)P' 



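In the rules, $\mu$ ranges over
$\{\tau\} \cup \{m[x], \overline{m}[x], mx, \overline{m}(x) \mid m, x \in \mathcal{N}\}$
and $\delta$ ranges over
$\{\tau\} \cup \{m[x], \overline{m}[x], mx, \overline{m}(x), [y/x], (y/x] \mid m, x, y \in \mathcal{N}\}$.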

The localization operator in asymmetric y-calculus is fundamentally differ- 
ent from the restriction combinator in 7r-calculus. The next example is quite il- 
luminating: (x)([a;=?/]a[a]|6[a;]) ~ in 7r-calculus but (x)([a;=y]a[a]|6[a;]) ^ 

{x)b[x] in the asymmetric y-calculus. In asymmetric y-calculus the effect of the 
consecutive communications 



{z){a[z].{b){b[z]\b[y]))\{x)a[x].Q ^ (6)(0|0)|Q[?//a:] « Q[y/x] 
is the same as the y-communication 

a[y]\{x)^x].Q 0\Q[y/x] « Q[y/x]. 

The asymmetric χ-calculus can be investigated in exactly the same way as
the χ-calculus has been. The proofs of the corresponding results are more or
less the same. In this language, we think of an action with a positive
subject name as an input action and of an action with a negative subject name
as an output action. Let $fo$ denote the set
$\{\overline{a}[x] \mid a, x \in \mathcal{N}\}$ of free outputs, $fi$ the set
$\{a[x] \mid a, x \in \mathcal{N}\}$ of free inputs, $i$ the set
$\{ax \mid a, x \in \mathcal{N}\}$ of inputs, $ro$ the set
$\{\overline{a}(x) \mid a, x \in \mathcal{N}\}$ of restricted outputs, $u$
the set $\{[y/x] \mid x, y \in \mathcal{N}\}$ of updates and $ru$ the set
$\{(y/x] \mid x, y \in \mathcal{N}\}$ of restricted updates. We can define
L-bisimilarities $\approx_L$, open bisimilarity and barbed bisimilarity for
asymmetric χ-processes as we have for χ-processes. There are altogether 63
L-bisimilarities for the asymmetric χ-processes. The next theorem conveys all
the necessary information to construct the bisimulation lattice of the
asymmetric χ-calculus.

Theorem 8. The following properties hold for the asymmetric χ-calculus:

(i) ^ruufiUu and coincide respectively with open and barbed bisimilarities. 

(a) 

(Hi) ~L^~fo- 

('zuj if fi f] Li = (/) and /z C L 2 . 

(v) if ruD Li = 0 and rw C L 2 . 

(vi) ~Li%~L 2 'If (ju U ro) n Li = 0 and ro C L 2 . 

(vii) if uD Li =0 and m C L 2 . 

(via) If LC\i = % then the inclusion is strict. 

(ix) RdruC^ro- The inclusion is strict. 
Fig. 4. The Bisimulation Lattice of Asymmetric Chi Processes 



Proof. (i,ii) The proofs are similar to those in the χ-calculus.

(iii) Suppose P Q and P P' . Let R be a[x\+{(j)) such that (j) G L 

and n{(j>) n gn{P\Q) = 0. Then P\R — ^ P'|0. It follows that Q' exists such 
that Q\R Q'\0 and P' Q' . Due to asymmetry, it must be the case that 

(iv) Assume /i n Li = 0 and fi C L 2 - Then {x)a[x].{b){b[z]\b[x]) a[z] + 

(a;)a[a;].(5)(6[z]|5[a;]) but {x)a[x].{b){b[z]\b[x]) o-[z] + {x)a[x].{h){h[z]\f}[x\). 

(v) One has (a)((a;)a[a;]|o[2/])) ^ru 0 but (a)((a;)a[a;] |a[?/])) «Li 0 whenever 
ru n Li = 0. 

(vi) Let Rhe (o:)a[a;].(6)(6[a;]|(?/)6[?/].(w)c[w]). Suppose (ruUro)nLi = 0 and 
ro C L 2 - Then R^Li (cc)a[a;].(w)c[w] + R but R^L 2 (cc)a[a;].(w)c[w] + R. 

(vii) Let R be {b){z)(b[z]\h[x].{a){a[z]\a[y\ + a[y] |a[z])). Suppose uf^Ll = 0 
and u C L 2 - Then R (a)(a[a:] |a[?/]) + R but R (a)(®[2^]|a[y]) + R- 

(viii) By (ii) through (vii) the inclusion is strict whenever LHi = 0. 

(ix) Suppose P ^ru Q and P P' . Then P|a[z] P'|0 for fresh z. 

So Q|a[z] Q'\0 for some Q'. Hence Q Q' . Therefore ^ruQ^ro- The 
strictness follows from (v). □ 



According to Theorem El the L-bisimilarities ~/i, ~ro, and ^ru are 
all pairwise distinct. Using these five relations, the bisimulation lattice of asym- 
metric x-processes can be generated. The pictorial description of the lattice is 
given by the diagram in Fig. El There are altogether 12 elements in the lattice. 
The bottom element is ^ruu fiuu, which characterizes the open bisimilarity. The 
top element is which coincides with the barbed bisimilarity. 



7 The Asynchronous Asymmetric Chi 

It seems reasonable to remove the symmetry of communications in asynchronous 
X-calculus. The resulting language has the same grammar as the asynchronous x 



Fig. 5. The Bisimulation Lattice of Asynchronous Asymmetric Chi Processes 



and the same operational semantics as the asymmetric χ. Without further ado,
we come to describe the bisimulation lattice of the language. 

Theorem 9. In the asynchronous asymmetric χ-calculus, one has:

(i) ^ fi-, ^ru7 

(ii) are pairwise distinct. 

Proof. Apart from the part to establish (vi), the proof of Theorem 0 works 
fine. The only extra work is to prove that for all L-bisimilarities 

Suppose P Q and P P' . Let R be {y)a[y].b[y\ for some fresh b. Then 

P|i? P'| 0 . It follows that Q' exists such that Q\R Q'\0 and P' Q' ■ 

The rest of the argument is similar to that given in the proof of Theorem 0 In 
Q\R Q^|0 only the reduction causally depends on the communication 
involving R. So this particular communication can be delayed to happen just 

before the action involving b. It can then be easily seen that Q Q'. To help
understand the idea, let us see an example. Suppose Q is (x)(a[x]|Q1). Then



{x){a[x]\Qi)\{y)a[y].b[y] {x){a[x]\Q 2 )\{y)a[y].b[y] 

— » (a:)((0|g2)|&N) 

^ (0|Q3)|0 = Q'|0. 



Clearly (a;)(o[a;] |Qi) (a;)(a[a;] IQ 3 ) ^ 0 IQ 3 = Q'. □ 

The bisimulation lattice of the asynchronous asymmetric χ-calculus is pictured
by the diagram in Fig. 5.

8 Final Remark 

The χ-calculus is meant to be a concurrent generalization of the λ-calculus. In the
latter the variables are uniform in the sense that both free and closed variables
can be instantiated by any term. Free and closed variables become global and
local names in our model. In this way the names in the χ-calculus behave very much
like logical variables. This is not at all surprising as the calculus is designed with
proof theory in mind. At the level of operational semantics, computation-as-cut-elimination
is a well accepted approach in the world of the λ-calculus and
its descendants but has attracted far less attention in process algebra. The
χ-calculus advocates this approach and invites further study of it.

The L-bisimilarities are introduced as a possible classification of bisimilarities
on χ-processes. In both χ and its asymmetric version, open bisimilarity and
barbed bisimilarity are respectively the bottom and the top elements of the
bisimulation lattices. Other well-known bisimilarities also fit into the lattice hi- 
erarchy. For instance, ground bisimilarity coincides with «ouo- 

The fact that the barbed bisimilarity on χ-processes is different from the
obvious bisimilarity called open bisimilarity in this paper was first discovered 
in [m . In present paper the difference is recast in the framework of L-bisimilarity. 
The barbed bisimilarity on χ-processes has some very interesting and unusual
properties unknown from the study of π-processes. Our result implies that weak
barbed congruence and weak hyperequivalence on Fusion processes are different. 

The operational semantics of χ defined in this paper is slightly different from
that given in Pj. In that paper the reduction 

a[x].P\a[x\.Q P\Q 

is not admissible although 

{x){a[x].P\a[x].Q) {x){P\Q) 

is legal. It should be pointed out that if this restricted communication mechanism 
is adopted, the elements of the bisimulation lattices discussed in this paper would 
proliferate. For instance the bisimulation lattice of asymmetric χ-processes would
have eighteen elements, the reason being that « fo is different from in this 
variant for the same reason that is different from 

In our definition of L-bisimilarities, an update transition P p' is required 

to be matched up by Q Q' such that P' Q' . This is correct because 

we use an early semantics. In late semantics we must replace P' Q' by 
P'[y/x] Q'[y/x]. A simple example suffices to explain the situation. Let R be 

[y|a;].(a;[a;]|y[?/].c[c]) where all names are assumed to be distinct. Then [y|a;].c[c] -I- 

R^l R- In late semantics there is no i?' such that R R' c[c]. 

In a sense, axiomatization is internalization. Usually a meta-operation is 
internalized as choice operator and a meta-judgement is internalized as match 
combinator in calculi of mobile processes. The simplicity of the axiomatization 
systems for χ-processes is due to the fact that, unlike in the π-calculus, one does not
have the trouble of having to treat the localization operator separately by resorting
to distinctions. In the χ-calculus, if a local name can be ‘opened up’ it is open for all
instantiations. The work reported in this paper was carried out independently 
with US]. As far as axiomatization is concerned, the novelty of our result is a 
complete system for barbed congruence. We believe that by adding a modified 
version of the output law 03 to Parrow and Victor’s axiom system for weak
hyperequivalence we get a complete system for the weak barbed congruence on 
finite Fusion processes. 

An interesting avenue for further investigation is the axiomatization of asymmetric
χ-processes. We have already obtained complete systems for some of the
twelve L-congruence relations. The rest appears more subtle. 
Abstract. LOTOS is a formal specification language, designed for the
precise description of open distributed systems and protocols. Our purpose
is to introduce the operators of logics (for example, disjunction,
conjunction, greatest fixpoint, least fixpoint in the μ-calculus) into (basic)
LOTOS, in order to describe flexible specifications. Disjunction operators
∨ have already been proposed for expressing two or more implementations
in a flexible specification. In this paper, we propose an extended
LOTOS with two state operators. They can control recursive behavior,
in order to express eventuality. Eventuality is useful for liveness properties,
which state that something good must eventually happen. Then, we present a
method for checking the consistency of a number of flexible specifications,
and a method for producing a conjunction specification of them.



1 Introduction 

The design of large scale distributed systems is known to be a complex task. 
In order to support the design, formal description techniques (FDTs) are used 
for verifying that a realized system conforms to its specification. Process algebras
such as CCS, CSP, and LOTOS are one kind of FDT, and LOTOS in particular
is standardized by ISO.

In practice, a system is often given flexible specifications instead of a
complete specification in the first design step, and the flexible specifications are
refined step by step to reduce the number of possible implementations. In
this case, a flexible specification represents two or more different implementations,
whereas a specification described in process algebra usually represents only one
implementation, up to equivalence.

In order to describe such flexible specifications, disjunction operators V have 
been proposed by Steen et al. P] for LOTOS and independently by us0 for 
Basic CCS. These operators are similar to a disjunction operator in logic:
if P1 is an implementation of a specification S1 and P2 is an implementation of
a specification S2, then the specification S1 ∨ S2 can be implemented by either
P1 or P2, where an implementation is formally a specification expression which
does not contain disjunction operators (i.e. it is executable). It is important to
note that the non-determinism of CSP can not always play the role of the disjunction
∨, because specifications themselves can contain non-determinism, such as for gambling
machines or timeout (see ini)- 

For example, the following specification AB represents implementations which 
can iteratively perform the action a or can stop after the action b. 

AB := a; AB ∨ b; stop

where ; is a prefix operator, thus a; AB requires its implementations that they can 
perform a and thereafter conform to the specification AB. The symbol := is used 
for defining the left Constant AB as the right specification, thus it is a recursive 
definition. In this case, the disjunction V is recursively resolved. Therefore, all 
the following implementations satisfy the specification AB. 

A∞ := a; A∞,   AB0 := b; stop,   AB2 := a; a; b; stop

In the above example, the action b is not necessarily performed in implementations
satisfying AB, because A∞ satisfies AB.

Designers often require that something good must eventually happen, namely
a liveness property. For example, if the above action b must eventually happen,
then how should AB be modified? One answer is to use an infinite disjunction (intuitively,
like ∨(n≥0) a^n; stop), but the infinity complicates integration, verification,
etc. of flexible specifications.

In this paper, we propose to use two kinds of states, called stable states and
unstable states, in order to express eventuality. Intuitively, disjunction operators
must be resolved so that a stable state is eventually selected. For example, the
following specification AB' represents implementations which can perform a
finitely many times and must eventually stop after b.

AB' := ◁a; AB' ∨ ○b; stop

where ◁ and ○ are called an un-stabilizer and a stabilizer, and they make an
unstable state (◁a; AB') and a stable state (○b; stop), respectively. Thus,
(◁a; AB') makes it impossible to select the action a infinitely often. Consequently, the
above AB0 and AB2 satisfy AB', but A∞ does not satisfy AB'.

The outline of this paper is as follows. In Section 2, we propose an extended
labelled transition system called μLTS, by introducing unstable states into the
ALTS. The ALTS is a labelled transition system (LTS) extended by adding
unlabelled transitions for disjunction operators. Then, we define a specification
language called μLOTOS based on the μLTS. In Section 3, a satisfaction relation
between an implementation and a specification is defined, and the properties
of unstable states are shown. In Section 4, we present a method for checking
the consistency of a number of specifications, and a method for producing a
conjunction specification of them. In Section El we discuss related works. In 
Appendix, a table of the notations used in this paper is given. 

2 Definition of Specifications 

In this section, we present a specification language called μLOTOS for describing
flexible specifications. In order to concisely explain our main ideas, we will only
consider a small subset of the operators of LOTOS in this paper, but it is not
difficult to introduce the other operators into μLOTOS.

In Subsection 2.1, the syntax of μLOTOS is defined. In Subsection 2.2, a
μLTS is given, and then the semantics of μLOTOS is defined.



2.1 Syntax 

We assume that a finite set of names 𝒩 is given. The set of actions Act is
defined as Act = 𝒩 ∪ {i} and α, β, ··· are used to range over Act, where i is a
special action called an internal action (i ∉ 𝒩). We give a set of state operators
Ψ = {○, ◁}, where ○ is called a stabilizer and ◁ is called an un-stabilizer. The set
Ψ is ranged over by ψ, φ, ···.

We also assume that a set of specification constants (also called Constants)
𝒦 and a set of specification variables (also called Variables) 𝒳 are given. The set
𝒦 is ranged over by A, B, ···, and the set 𝒳 is ranged over by X, Y, ···.

Then, the syntax of μLOTOS is defined.

Definition 2.1 We define specification expressions M with the following syntax:

M ::= A | X | stop | ψα; M | M [] M | M |[G]| M | M ∨ M

where A ∈ 𝒦, X ∈ 𝒳, ψ ∈ Ψ, α ∈ Act, and G ⊆ 𝒩. The set of all the
specification expressions is denoted by ℳ, and M, N, ··· range over ℳ. The
operators ; , [] , |[G]|, and ∨ are called a Prefix, a Choice, a Parallel, and a
Disjunction, respectively. □

The difference between the Choice operator [] and the Disjunction operator
∨ is intuitively explained as follows. For the Choice, users decide whether
M [] N behaves like either M or N at run time, i.e. a dynamic choice. For the
Disjunction, designers decide whether M ∨ N is implemented by either M or N
in the specification phase, i.e. a static choice. Thus, Disjunctions are used only in
specifications and do not remain in implementations.

The Parallel operator |[G]| of LOTOS synchronizes actions included in G 
and independently performs the other actions. This can synchronize three or 
more specifications. 

We write Var(M) for the set of Variables occurring in the specification expression
M, and it is inductively defined as follows:

Var(A) = ∅,      Var(ψα; M) = Var(M),
Var(stop) = ∅,   Var(M op N) = Var(M) ∪ Var(N),
Var(X) = {X},

where op is [] or |[G]| or ∨. A specification expression M is called a specification,
if it contains no Variables (i.e. Var(M) = ∅). The set of specifications is denoted
by 𝒮, and it is ranged over by S, T, U, ···.
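The inductive definition of Var can be read directly as a recursive function. The following is a minimal sketch in Python, assuming a hypothetical encoding of specification expressions as nested tuples; the tags "const", "var", "stop", "prefix", "choice", "par", and "dis" are illustrative names and not notation from this paper.

```python
def variables(m):
    """Return Var(M) for an expression encoded as a nested tuple (hypothetical encoding)."""
    tag = m[0]
    if tag in ("const", "stop"):      # Var(A) = Var(stop) = empty set
        return frozenset()
    if tag == "var":                  # Var(X) = {X}
        return frozenset({m[1]})
    if tag == "prefix":               # ("prefix", psi, alpha, M): Var(M)
        return variables(m[3])
    if tag in ("choice", "dis"):      # ("choice", M, N) or ("dis", M, N)
        return variables(m[1]) | variables(m[2])
    if tag == "par":                  # ("par", G, M, N)
        return variables(m[2]) | variables(m[3])
    raise ValueError(f"unknown expression tag: {tag!r}")


def is_specification(m):
    """An expression is a specification iff it contains no Variables."""
    return not variables(m)


# An expression with one Variable X and one Constant A.
example = ("dis",
           ("prefix", "unstab", "a", ("var", "X")),
           ("prefix", "stab", "b", ("const", "A")))
```

Here `variables(example)` contains only the Variable X, so `example` is a specification expression but not a specification.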

A Constant is a specification whose meaning is given by a defining equation.
We assume that for every Constant A ∈ 𝒦, there is a defining equation of the
form A := S, where S is a specification which can contain Constants again.
Thus, it is a recursive definition. We assume that recursion must be guarded by
Prefixes, such as A := ○a; A. For example, we do not consider A := A [] ○a; stop.

The state operators ○ and ◁ make stable states and unstable states. A stable
state corresponds to a state in standard LOTOS. If every un-stabilizer ◁ is
replaced with a stabilizer o, then ^LOTOS is the same as the language of m 
Note that the stabilizer ○ is often omitted. For example, ○a; M is written as a; M.

A specification which contains neither Disjunctions nor un-stabilizers is
called a process or an implementation. Thus, the set of processes 𝒫 is a subset
of 𝒮, and the syntax is defined in terms of the following BNF expression:

P ::= A | stop | ○α; P | P [] P | P |[G]| P

where A ∈ 𝒦P ⊆ 𝒦, α ∈ Act, and G ⊆ 𝒩. We assume that for every Constant
A ∈ 𝒦P, there is a defining equation of the form A := P, where P ∈ 𝒫. The set
𝒫 is ranged over by P, Q, ···.

In order to avoid too many parentheses, operators have binding power in 
the following order: Prefix > Parallel > Choice > Disjunction. We also use the 
following short notations: 

Σ C = stop                      (C = ∅)
Σ C = M1 [] M2 [] ··· [] Mn     (C = {M1, M2, ···, Mn})

∨ C = F                         (C = ∅)
∨ C = M1 ∨ M2 ∨ ··· ∨ Mn        (C = {M1, M2, ···, Mn})

where C is a finite subset of specifications and the relation = represents syntactic 
identity. F is a specification constant defined as follows: 

F := ◁i; F

where i is an internal action. Intuitively, no process satisfies F, because F has 
only one unstable state. On the other hand, a specification constant T which is 
satisfied by all the processes is defined as follows: 

T := ∨{Σ{○α; T : α ∈ A} : A ⊆ Act}

The formal properties of F and T are shown in Proposition 3.2 in Section 3.



2.2 Semantics 

At first, we propose an extended labelled transition system called μLTS, by
introducing unstable states into the ALTS. The ALTS is a labelled transition
system (LTS) with unlabelled transitions. The difference between the μLTS and
the ALTS is that the μLTS has the set Q of stable states. In other words, if Q
is the set of all the states, then the μLTS is the same as the ALTS.

Definition 2.2 A μLTS is a structure (ST, L, →, ↦, Q), where ST is a set of
states, L is a set of labels, → ⊆ ST × L × ST is a set of labelled transitions,
↦ ⊆ ST × ST is a set of unlabelled transitions, and Q ⊆ ST is a set of stable
states. As a notational convention, we write s −α→ s' for (s, α, s') ∈ → and s ↦ s'
for (s, s') ∈ ↦. □
In μLTS, stable states are defined as follows: a state s ∈ ST is stable if and
only if either s G Q) or s s' for all s\ where s -/-> means that there is no 

pair (a, s') such that s s'. So, a stop s s' is stable, even if s ^ Q- 

The semantics of μLOTOS is given by the μLTS (ℳ, Act, →, ↦, Q), where
→ is defined in Definition 2.3, ↦ is defined in Definition 2.4, and Q is defined in
Definition 2.5. In this paper, we consider only specification expressions with finite
states.

Definition 2.3 The labelled transition relation → ⊆ ℳ × Act × ℳ is the smallest
relation satisfying the following inference rules.

Name   Hypothesis ⊢ Conclusion
Act    ⊢ ψα; M −α→ M
Con    S −α→ S', A := S ⊢ A −α→ S'
Par3   M −α→ M', N −α→ N', α ∈ G ⊢ M |[G]| N −α→ M' |[G]| N'

The rules for the Choice operator and for the unsynchronized actions (α ∉ G) of
the Parallel operator are the same as in standard LOTOS. □



Definition 2.4 The unlabelled transition relation ↦ ⊆ ℳ × ℳ is the smallest
relation satisfying the following inference rules.

Name    Hypothesis ⊢ Conclusion
Act∨    ⊢ ψα; M ↦ ψα; M
Stop∨   ⊢ stop ↦ stop
Par∨    M ↦ M', N ↦ N' ⊢ M |[G]| N ↦ M' |[G]| N'
Dis1    M ↦ M' ⊢ M ∨ N ↦ M'
Dis2    N ↦ N' ⊢ M ∨ N ↦ N'



Definition 2.5 The set of stable states Q ⊆ ℳ is the smallest relation satisfying
the following inference rules.

Name    Hypothesis ⊢ Conclusion
Act○    ⊢ ○α; M ∈ Q
...     ... ⊢ M ∨ N ∈ Q






Yoshinao Isobe, Yutaka Sato, and Kazuhito Ohmaki 



The rules for labelled transitions → are exactly the same as the rules in standard
LOTOS, except for ψ in Act. The state operator does not affect →.

Unstable states can be made from un-stabilizers ◁, because there is no rule
for ◁α; M in Definition 2.5. It is noted that there are stable states M, even if M ∉
Q. For example, the specification S = (◁a; S1 |[a, b]| ○b; S2) is stable, because
S' −/→ for all S' such that S ↦ S', although S ∉ Q.

Unlabelled transitions ↦ are used for resolving disjunction operators,
as shown in the rules Dis1,2. Intuitively, a process P satisfies a specification S, if
and only if S ↦ S' and P satisfies S' for some specification S'. For example, the
following specification VM of a vending machine can be implemented by either
(coin; coffee; stop) or (coin; tea; stop),

VM := (coin; coffee; stop) ∨ (coin; tea; stop)

because VM ↦ (coin; coffee; stop) and VM ↦ (coin; tea; stop).

The definition of i— > is slightly changed from EH. In our definition, all the 
specifications can perform unlabelled transitions, and it is not necessary to successively
perform unlabelled transitions twice. Formally, the following proposition
holds, where ℳ0 is the set of specification expressions which do not change
states by unlabelled transitions, thus ℳ0 = {M : {M' : M ↦ M'} = {M}}.

Proposition 2.1 If M ↦ M', then M' ∈ ℳ0.

Proof By induction on the length of the inference of M ↦ M'. We show only one
case, by Par∨, here. By Par∨, M ↦ M' implies that for some M1, M2, M1', M2',
and G, M = M1 |[G]| M2, M' = M1' |[G]| M2', M1 ↦ M1', and M2 ↦ M2'. Thus,
by induction, M1' ∈ ℳ0 and M2' ∈ ℳ0. These imply that if M' = M1' |[G]| M2' ↦
M'', then M'' = M1' |[G]| M2' by Par∨. Hence, M' ∈ ℳ0. □
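The resolution of disjunctions by unlabelled transitions can be sketched as a small recursive function. The sketch below is in Python and rests on several assumptions: expressions are encoded as nested tuples (a hypothetical encoding, not notation from this paper), Constants are left out, and the rule used for the Choice operator (both operands are resolved independently) is assumed by analogy with the Parallel rule rather than taken from the text.

```python
def resolve(m):
    """Return the set {M' : M |-> M'} for a tuple-encoded expression without Constants."""
    tag = m[0]
    if tag in ("stop", "prefix"):
        # stop and prefixed expressions only step to themselves.
        return {m}
    if tag == "dis":
        # A Disjunction steps to any resolution of either operand.
        return resolve(m[1]) | resolve(m[2])
    if tag == "choice":
        # Assumed rule: resolve both operands of a Choice independently.
        return {("choice", l, r) for l in resolve(m[1]) for r in resolve(m[2])}
    if tag == "par":
        # ("par", G, M, N): resolve both operands, keep the gate set G.
        return {("par", m[1], l, r) for l in resolve(m[2]) for r in resolve(m[3])}
    raise ValueError(f"unsupported expression tag: {tag!r}")


stop = ("stop",)
coffee = ("prefix", "stab", "coin", ("prefix", "stab", "coffee", stop))
tea = ("prefix", "stab", "coin", ("prefix", "stab", "tea", stop))
vm = ("dis", coffee, tea)  # body of the vending machine VM
```

With this encoding, `resolve(vm)` yields exactly the two implementations of the vending machine, and every result resolves only to itself, which is the statement of Proposition 2.1 restricted to this fragment.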

In the rest of this paper, ℳ0 (which is a subset of ℳ) is ranged over by
M0, N0, ···. Also, we use 𝒮0 to denote the set of M0 which contain no
Variables, thus 𝒮0 = 𝒮 ∩ ℳ0, and 𝒮0 is ranged over by S0, T0, ···.

3 Satisfaction 

In this section, we define a satisfaction P |= S' of a process P for a specification 
S as an extension of the satisfaction P |=E3) ^ in E 5 - Fhe definition of h=El| 
has been given as follows: the satisfaction |=^] is the largest relation such that, 
P hEl S implies that for some Sg, S 1-^ Sg and for all a G Act the following 
two conditions hold: 

(LE 1 |) if P P' then, for some S', Sg S' and P' |=E3] ' 

(zz.Ell) if ^0 then, for some P' , P P' and P' ■ 

This requires that there exists an S0 which satisfies both conditions.
This makes it possible that a specification can be satisfied by two or more different
processes. As shown in Proposition 2.1, S0 can not be resolved any more by
↦, but it may be resolved again after a labelled transition S0 −α→ S'. Therefore,
the definition of the satisfaction is inductive. For example, the specification
(a; (b; stop ∨ c; stop) ∨ d; stop) can be implemented by either (a; b; stop), or
(a; c; stop), or (a; b; stop [] a; c; stop), or (d; stop).

In the definition above, the specification S0 can be freely selected from
{S0 : S ↦ S0}. On the other hand, we can control the selection by state operators
○ and ◁. The key point is that S must eventually reach a stable state. Then, our
satisfaction is defined as follows.

Definition 3.1 A relation ℛ ⊆ 𝒫 × 𝒮 is a satisfaction relation, if (P, S) ∈ ℛ
implies (P, S) ∈ θ(ℛ), where θ(ℛ) ⊆ 𝒫 × 𝒮 is inductively defined for any relation
ℛ, as follows:

• (P, S) ∈ θ^(0)(ℛ) iff for some S0, S ↦ S0, S0 ∈ Q, and for all α ∈ Act,

(i) if P −α→ P' then, for some S', S0 −α→ S' and (P', S') ∈ ℛ,

(ii) if S0 −α→ S' then, for some P', P −α→ P' and (P', S') ∈ ℛ,

• (P, S) ∈ θ^(n+1)(ℛ) iff for some S0, S ↦ S0 and for all α ∈ Act,

(i) if P −α→ P' then, for some (m, S'), S0 −α→ S', (P', S') ∈ θ^(m)(ℛ), m ≤ n,

(ii) if S0 −α→ S' then, for some (m, P'), P −α→ P', (P', S') ∈ θ^(m)(ℛ), m ≤ n,

• (P, S) ∈ θ(ℛ) iff (P, S) ∈ θ^(n)(ℛ), for some n. □



Definition 3.2 P satisfies S, written P ⊨ S, if (P, S) ∈ ℛ, for some satisfaction
relation ℛ (i.e. ⊨ is the relation ⋃{ℛ : ℛ is a satisfaction relation}).
We use the notation Proc(S) for the set of all the processes which satisfy the
specification S (i.e. Proc(S) = {P : P ⊨ S}). □

The relation ⊨ is the largest satisfaction relation, and we can prove that
P ⊨ S if and only if (P, S) ∈ θ(⊨). P ⊨ S requires that S must reach a stable
state S0 after finite transitions, where P and S must keep the relation ⊨. It is
noted that if P and S i— > So for some So, then (P, S) G 0O)(^), even if 

So ^ Oi because So is stable. For example, (stop,<ia; Si |[o, 6]| o6; S 2 ) G 0O)(|=). 
The following relations between |= and be easily shown. 

— if P ^ S, then P S. 

— if P S and S has only stable states, then P |= S 

The important proposition for the Disjunction V shown in CH holds also for our 
satisfaction. 

Proposition 3.1 Let S, T ∈ 𝒮. Then Proc(S ∨ T) = Proc(S) ∪ Proc(T).
Proof This is similar to the proof of Proposition 7 in Hg. □ 
In Subsection 2.1, we defined two special specifications T and F. Two propositions
for T and F are given: Proposition 3.2 shows that all the processes satisfy
T and no process satisfies F. Proposition 3.3 shows the properties for substitution,
where the notation M{N/X} indicates the substitution of N for every
occurrence of the Variable X in M, and the notation M̃ is an indexed set
M1, ···, Mn. For example, {S̃/X̃} represents {S1/X1, S2/X2, ···, Sn/Xn}, and
{T/X̃} represents {T/X1, T/X2, ···, T/Xn}.

Proposition 3.2 (1) Proc(T) = 𝒫 and (2) Proc(F) = ∅.

Proof (1) T has only stable states and any combination of actions (A ⊆ Act) can
be selected by ↦ from T. (2) F has only one unstable state, because ◁i; F ∉ Q
and F always has a transition by i. □



Proposition 3.3 Let M contain Variables X̃ at most. For any S̃ ∈ 𝒮, the following
relations hold.

1. Proc(M{S̃/X̃}) ⊆ Proc(M{T/X̃}).

2. Proc(M{F/X̃}) ⊆ Proc(M{S̃/X̃}).

Proof (outline)

1. We can show that the following ℛ is a satisfaction relation.

ℛ = {(P, T) : ∃S, P ⊨ S, T ∈ TR(S), P ∈ 𝒫, S ∈ 𝒮} ∪ ⊨

where TR(S) is the set of all the specifications obtained from the specification
S by replacing some subexpressions of S by T.

2. We can show that the following ℛ is a satisfaction relation.

ℛ = {(P, M{S̃/X̃}) : P ⊨ M{F/X̃}, P ∈ 𝒫, S̃ ∈ 𝒮} ∪ ⊨

For this proof, the following property is used: If M{F/X̃} ↦ T0 and P ⊨ T0,
then for some M0, M ↦ M0, T0 = M0{F/X̃}, M0 is guarded, and for any
S̃, M{S̃/X̃} ↦ M0{S̃/X̃}. □

Next, we show examples of P ⊨ S. At first, consider the following process
PAB and the specification SAB:

PAB := a; a; b; PAB      SAB := ◁a; SAB ∨ ○b; SAB

In the specification SAB, only (○b; SAB) is stable. Thus, SAB requires that the
action b must always eventually be performed, although the action a may be
performed zero or more times before b. In this case, we can show that PAB ⊨
SAB, because the following ℛ is a satisfaction relation,

ℛ = {(PAB, SAB), (a; b; PAB, SAB), (b; PAB, SAB)}

since (PAB, SAB) ∈ θ^(2)(ℛ), (a; b; PAB, SAB) ∈ θ^(1)(ℛ), and (b; PAB, SAB)
∈ θ^(0)(ℛ).
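For finite-state processes and specifications, the largest satisfaction relation can be computed by iteration. The following Python sketch is one possible reading of Definitions 3.1 and 3.2, not an algorithm given in this paper. It assumes that the transition systems are supplied explicitly as dictionaries: `proc` maps a process state to its labelled moves, `spec_res` maps a specification to its resolved states (the targets of unlabelled transitions), `spec_tr` maps a resolved state to its labelled moves, and `stable` is the set of resolved states in the set of stable states. All of these names are hypothetical.

```python
def theta(R, proc, spec_res, spec_tr, stable):
    """Compute theta(R), the union of theta^(n)(R) over all n, for finite systems."""

    def matches(p, s0, ok):
        # Conditions (i) and (ii): each labelled move of one side is matched
        # by an equally labelled move of the other, with successors in `ok`.
        fwd = all(any(b == a and (p2, s2) in ok for b, s2 in spec_tr[s0])
                  for a, p2 in proc[p])
        bwd = all(any(b == a and (p2, s2) in ok for b, p2 in proc[p])
                  for a, s2 in spec_tr[s0])
        return fwd and bwd

    pairs = [(p, s) for p in proc for s in spec_res]
    # theta^(0)(R): resolve S to a stable state whose successors are related by R.
    result = {(p, s) for p, s in pairs
              if any(s0 in stable and matches(p, s0, R) for s0 in spec_res[s])}
    # theta^(n+1)(R): any resolution, successors already in some theta^(m)(R).
    changed = True
    while changed:
        changed = False
        for p, s in pairs:
            if (p, s) not in result and any(matches(p, s0, result)
                                            for s0 in spec_res[s]):
                result.add((p, s))
                changed = True
    return result


def largest_satisfaction(proc, spec_res, spec_tr, stable):
    """Shrink the full relation until R is contained in theta(R)."""
    R = {(p, s) for p in proc for s in spec_res}
    while True:
        smaller = R & theta(R, proc, spec_res, spec_tr, stable)
        if smaller == R:
            return R
        R = smaller


# PAB := a; a; b; PAB (states P0, P1, P2), the looping process a; a; ... (Ainf),
# and b; stop (states B0, B1).
proc = {"P0": {("a", "P1")}, "P1": {("a", "P2")}, "P2": {("b", "P0")},
        "Ainf": {("a", "Ainf")}, "B0": {("b", "B1")}, "B1": set()}
# SAB with an unstable a-branch and a stable b-branch; ABp stands for AB'.
spec_res = {"SAB": {"ua_s", "sb_s"}, "ABp": {"ua_p", "sb_p"}, "STOP": {"stop0"}}
spec_tr = {"ua_s": {("a", "SAB")}, "sb_s": {("b", "SAB")},
           "ua_p": {("a", "ABp")}, "sb_p": {("b", "STOP")}, "stop0": set()}
stable = {"sb_s", "sb_p", "stop0"}
sat = largest_satisfaction(proc, spec_res, spec_tr, stable)
```

In this encoding the pair ("P0", "SAB") is in `sat`, in line with the example above, whereas the process that performs a forever is related to neither SAB nor AB'.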
Fig. 1. The transition graphs of READ, UPDATE, and FILE (○ : a stable state)



Secondly, the following specification FILE is considered.

FILE := open; OPENED × creat; FILE
OPENED := ◁write; OPENED × ◁read; OPENED × close; FILE

where M × N is the short notation defined as follows.

M × N = M ∨ N ∨ (M [] N)

The specification OPENED requires that the action close must be eventually
performed, because of the un-stabilizers of ◁write and ◁read. Thus, this specification
FILE requires that a file must be eventually closed by the action close
after being opened by the action open, and/or that a file can be created by the action
creat. The subexpression (◁write; OPENED × ◁read; OPENED) permits
actions write and read to be inserted after open and before close. For example,
FILE can be implemented by the following processes READ or UPDATE (i.e.
READ ⊨ FILE and UPDATE ⊨ FILE).

READ := open; read; close; READ [] creat; READ
UPDATE := open; read; (close; UPDATE [] write; close; UPDATE)

The transition graphs of READ, UPDATE, and FILE are shown in Fig. 1, where
each circle in FILE means a stable state, and unlabelled transitions which do
not change states are omitted.

The un-stabilizers ◁ in OPENED guarantee that the action close must be
eventually performed. If the un-stabilizer of ◁read in OPENED is replaced by
a stabilizer ○, then FILE can also be implemented by the following unexpected
process READLOOP.

READLOOP := open; LOOP, where LOOP := read; LOOP

As another example, a special case I_S := ◁i; I_S ∨ S is interesting, where i is
an internal action. This means that zero or more, but finitely many, internal actions can be
performed before S. Although the internal action is not distinguished from any
other actions in Definition 3.1, as in strong bisimilarity, it is possible by I_S to
ignore finite internal actions like in weak branching bisimilarity for convergent
specifications (no internal action cycles (p.l48 in m))- 

In the rest of this section, important properties of state operators ○ and ◁
are shown. At first, we define two subsets ℳ_ν and ℳ_μ of ℳ as follows.

Definition 3.3 The specification expressions M in ℳ_ν are defined with the following
syntax:

M ::= A | X | stop | ○α; M | M [] M | M |[G]| M | M ∨ M

where A ∈ 𝒦_ν ⊆ 𝒦, X ∈ 𝒳, α ∈ Act, and G ⊆ 𝒩. We assume that for every
A ∈ 𝒦_ν there is a defining equation of the form A := S and S ∈ ℳ_ν. □

Definition 3.4 The specification expressions M in ℳ_μ are defined with the following
syntax:

M ::= X | S | ◁α; M | M [] M | M |[G]| M | M ∨ M

where S ∈ 𝒮, α ∈ Act, and G ⊆ 𝒩. □

It is important to note that the differences between ℳ_ν and ℳ_μ are not only
○α; M and ◁α; M. Every specification in ℳ_ν contains no unstable states, and
My \s the same as the language in d. On the other hand, M^ can contain the 
specification ○α; M, if M contains no Variable, because ℳ_μ contains S ∈ 𝒮.

Then, Theorem 1 holds, where the indexed definition Ã := M̃{Ã/X̃} represents
Ai := Mi{A1/X1, ···, An/Xn} for each i ∈ {1, ···, n}.

Theorem 1. Let M̃ be guarded by Prefixes and contain Variables X̃ at most,
and let Ã := M̃{Ã/X̃}.

1. Let M̃ ∈ ℳ_ν. P ⊨ Mi{Ã/X̃} if and only if P ⊨ Mi^(n){T/X̃} for any n.

2. Let M̃ ∈ ℳ_μ. P ⊨ Mi{Ã/X̃} if and only if P ⊨ Mi^(n){F/X̃} for some n.

where Mi^(n) is the specification expression defined inductively as follows:

Mi^(0) = Xi,   Mi^(n+1) = Mi{M̃^(n)/X̃}

Proof (outline) 

1. The ‘only if part’ is directly shown by Proposition 3.3(1). For the ‘if part’,
we use another inductive definition rii>o H(i) of |=, where ^(o) = V x 
S and 6*(^(„)). Then we can show that for any n, if P H(n) 

A{M<">/1}{T/A}, then P ^(n) / X}{A/ X}, where N G My and 

N contain Variables X at most. This is not difficult, because N and M have 
no unstable states. Finally, we can set N = Xi, then the result follows. 
2. The ‘if part’ is directly shown by Proposition 3.3(2). For the ‘only if part’,
we show that the following ℛ is a satisfaction relation.

7^ = {(P, A^{M<">/1}{F/X}) : A := M{A/Xj, (F, N{A/X}) € 

n> m,N and M are guarded and contain Variables X at most, 
N G M^,M G h 

The key points are that (1) /X} is still guarded after n transitions, 

(2) N{A/X} must reach a stable state after m! transitions for some m! < m, 
because {P,N{A/ X}) G and (3) if V'{F/V} is stable and N' G 

then N' contains no Variables (i.e. 7V'{F/V} = N' {A/ X}). □ 

Since we consider only finite state specifications, Theorem 1 shows that if
M̃ ∈ ℳ_ν then Ã is the greatest fixpoint of the recursive equations X̃ = M̃, and if
M̃ ∈ ℳ_μ then Ã is the least fixpoint of them. For example, if M = a; X ∨ b; T,
A := M{A/X}, and P ⊨ A, then P may not perform b (i.e. may infinitely
perform a), because M ∈ ℳ_ν. On the other hand, if M = ◁a; X ∨ b; T, A :=
M{A/X}, and P ⊨ A, then P must eventually perform b, because M ∈ ℳ_μ.
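The approximants used in Theorem 1 are obtained by repeated substitution. As a minimal sketch for the single-Variable case, assuming a hypothetical encoding of expressions as nested tuples (the tags below are illustrative and not part of this paper), the unfolding M^(n) and its instantiation by F can be written in Python as follows.

```python
def subst(m, x, n):
    """Return M{N/X}: replace every occurrence of Variable x in m by n."""
    tag = m[0]
    if tag == "var":
        return n if m[1] == x else m
    if tag in ("const", "stop"):
        return m
    if tag == "prefix":               # ("prefix", psi, alpha, M)
        return (tag, m[1], m[2], subst(m[3], x, n))
    if tag in ("choice", "dis"):      # ("choice", M, N) or ("dis", M, N)
        return (tag, subst(m[1], x, n), subst(m[2], x, n))
    if tag == "par":                  # ("par", G, M, N)
        return (tag, m[1], subst(m[2], x, n), subst(m[3], x, n))
    raise ValueError(f"unknown expression tag: {tag!r}")


def approximant(m, x, n):
    """Return M^(n), where M^(0) = X and M^(n+1) = M{M^(n)/X}."""
    result = ("var", x)
    for _ in range(n):
        result = subst(m, x, result)
    return result


F = ("const", "F")
T = ("const", "T")
# M = <|a; X  v  b; T, written with an un-stabilizer on a and a stabilizer on b.
M = ("dis",
     ("prefix", "unstab", "a", ("var", "X")),
     ("prefix", "stab", "b", T))
second = subst(approximant(M, "X", 2), "X", F)  # M^(2){F/X}
```

Here `second` encodes ◁a; (◁a; F ∨ b; T) ∨ b; T, so a process satisfying it can perform a at most once before b, since no process satisfies F.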

4 Integration of Specifications 

A number of flexible specifications are sometimes given to a large system instead 
of its complete specification, because many designers work on the same system 
design in parallel, and it is not easy for each designer to know the whole system. 
Such a design method decreases the responsibility of each designer, but it raises two
important issues: consistency check of the flexible specifications and integration 
of them. In general, since the integrated specification satisfies all of them, it 
corresponds to a conjunction specification of them. The consistency and the 
conjunction specification are defined as follows. 

Definition 4.1 Let Si ∈ 𝒮. The specifications S1, ···, Sn are consistent with
each other, if ⋂(1≤i≤n) Proc(Si) ≠ ∅. A specification S is a conjunction
specification of S1, ···, Sn, if Proc(S) = ⋂(1≤i≤n) Proc(Si) ≠ ∅. □

In Subsection 4.1, a relation ~ is given for checking the consistency between
two specifications. In Subsection 4.2, a method called the A-method is given
for producing a conjunction specification of two specifications. A conjunction
specification of three or more specifications can be produced by iteratively using
~ and the A-method.

4.1 Consistency Check 

In this subsection, we consider the consistency of two specifications. At first, a 
relation ~ is defined as a generalized relation from the satisfaction ⊨.

Definition 4.2 A relation ℛ ⊆ 𝒮 × 𝒮 is a consistent relation, if (S, T) ∈ ℛ
implies (S, T) ∈ Θ(ℛ), where Θ(ℛ) ⊆ 𝒮 × 𝒮 is inductively defined for any
relation ℛ, as follows:

• (S, T) ∈ Θ^(0)(ℛ) iff for some S0 and T0, S ↦ S0, T ↦ T0, S0 ∈ Q, T0 ∈ Q,
and for all α ∈ Act,

(i) if S0 −α→ S' then, for some T', T0 −α→ T' and (S', T') ∈ ℛ,

(ii) if T0 −α→ T' then, for some S', S0 −α→ S' and (S', T') ∈ ℛ,

• (S, T) ∈ Θ^(n+1)(ℛ) iff for some S0 and T0, S ↦ S0, T ↦ T0, and for all α ∈ Act,

(i) if S0 −α→ S' then, for some (m, T'), T0 −α→ T', (S', T') ∈ Θ^(m)(ℛ), m ≤ n,

(ii) if T0 −α→ T' then, for some (m, S'), S0 −α→ S', (S', T') ∈ Θ^(m)(ℛ), m ≤ n,

• (S, T) ∈ Θ(ℛ) iff (S, T) ∈ Θ^(n)(ℛ), for some n. □

Definition 4.3 S ~ T, if (S, T) ∈ ℛ for some consistent relation ℛ. □

The relation $\theta(\mathcal{R})$ extends the relation $\vartheta(\mathcal{R})$, introduced earlier to define $\models$, to $\mathcal{S} \times \mathcal{S}$,
and we can show that $S \sim T$ if and only if $(S,T) \in \theta(\sim)$. The relation $S \sim T$
requires that $S$ and $T$ must eventually reach stable states at the same time.

It is important to note that the relation $\sim$ is too strong to check the consistency
between two specifications. For example, the following two specifications
SAB and SBA (SAB was also used in an earlier section) are consistent with each other,

$SAB := \triangleleft a; SAB \vee \circ b; SAB$, $SBA := \circ a; SBA \vee \triangleleft b; SBA$

because there exist processes $P$ such that $P \models SAB$ and $P \models SBA$, for example
$PAB := a; a; b; PAB$. On the other hand, $SAB \not\sim SBA$, because SAB and SBA cannot
reach stable states at the same time.

In this paper, we present a method for checking the consistency after
transforming a specification into a standard form. First, the following set is defined
in order to define the standard form, where $Dri(S_0) = \{S' : \exists a, S_0 \xrightarrow{a} S'\}$.

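Definition 4.4 Let $U \subseteq \mathcal{S}$. A set $V \subseteq \mathcal{S}$ is a pre-U set, if $S \in V$ implies that,

(pre1) if $S \mapsto S_0 \notin \bigcirc$, then $Dri(S_0) \subseteq V$,

(pre2) if $S \mapsto S_0 \in \bigcirc$, then $Dri(S_0) \subseteq U$.

Then, $Pre(U) = \bigcup \{V : V \text{ is a pre-}U\text{ set}\}$. □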

Then, the standard form is defined as follows, where $S \simeq T$ represents
$Proc(S) = Proc(T)$.

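Definition 4.5 A set $U \subseteq \mathcal{S}$ is a standard set, if $S \in U$ implies that,

(1) if $S \mapsto S_0 \notin \bigcirc$, then for some $S_0' \in \bigcirc$, $S \mapsto S_0' \simeq S_0$, $Dri(S_0') \subseteq Pre(U)$,

(2) if $S \mapsto S_0$, then for some $S_0'$, $S \mapsto S_0' \simeq S_0$, $Dri(S_0') \subseteq U$,

(3) if $S \mapsto S_0$, then either $Dri(S_0) \subseteq U$ or $Dri(S_0) \subseteq Pre(U)$.

Then, $STD = \bigcup \{U : U \text{ is a standard set}\}$. □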

Definition 4.6 Let $S \in \mathcal{S}$.

1. $S$ is in standard form, if $S \in STD$.

2. $S$ is in pre-standard form, if $S \in Pre(STD)$. □

As shown in Definition 4.5, if $S \in STD$ then for any $S_0$ such that $S \mapsto S_0$,
there is some $S_0' \in \bigcirc$ such that $S \mapsto S_0'$ and $Proc(S_0) = Proc(S_0')$. Furthermore,
every derivation $S'$ such that $S_0' \xrightarrow{a} S'$ satisfies $S' \in Pre(STD)$. This condition (1)
makes it possible to reach a stable state immediately if $S$ is in standard form, and
thereafter $S'$ must be in pre-standard form. In order to return to standard
form, $S'$ must eventually reach a stable state. Condition (2) requires that $S$
can remain in standard form, and (3) requires that $S$ must remain in either standard
form or pre-standard form.

Then, a specification ST(S) produced from S is defined as follows.
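Definition 4.7 Let $S \in \mathcal{S}$. The specification $ST(S)$ is defined as follows.

$ST(S) := \bigvee \{ST_0(S_0) : S \mapsto S_0\} \vee \bigvee \{SS_0(S_0) : S \mapsto S_0 \notin \bigcirc\}$

$PST(S) := \bigvee \{ST_0(S_0) : S \mapsto S_0 \in \bigcirc\} \vee \bigvee \{PST_0(S_0) : S \mapsto S_0 \notin \bigcirc\}$

$ST_0(S_0) = \sum \{\psi a; ST(S') : S_0 \xrightarrow{a} S', St(S_0) = \psi\}$

$PST_0(S_0) = \sum \{\psi a; PST(S') : S_0 \xrightarrow{a} S', St(S_0) = \psi\}$

$SS_0(S_0) = \sum \{\circ a; PST(S') : S_0 \xrightarrow{a} S'\}$

where $St : \mathcal{S}_0 \to \Theta$ is a state function defined as: if $S_0 \in \bigcirc$ then $St(S_0) = \circ$,
otherwise $St(S_0) = \triangleleft$. □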

The specifications $ST(S)$ and $PST(S)$ are Constants. Since we consider only
specifications $S$ with finite states, the number of states of $ST(S)$ is also finite.
The key point is that $ST(S)$ contains a stable state $SS_0(S_0)$ if $S \mapsto S_0 \notin \bigcirc$. It is
important to note that the derivation of $SS_0(S_0)$ is $PST(S')$ instead of $ST(S')$.
In order to return to $ST$, $S'$ must eventually reach a stable state (this is similar to
the requirement of STD).

Proposition 4.1 shows that the set of processes which satisfy $S$ is not changed
by the transformation ST, and Proposition 4.2 shows that $ST(S)$ is in standard
form for any $S$. Therefore, any specification $S$ can be transformed by ST into a
standard form $S'$ such that $S' \simeq S$.

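Proposition 4.1 Let $S \in \mathcal{S}$ and $S_0 \in \mathcal{S}_0$. Then

$S \simeq ST(S) \simeq PST(S)$,

$S_0 \simeq ST_0(S_0) \simeq PST_0(S_0) \simeq SS_0(S_0)$.

Proof (outline) For $S \simeq ST(S) \simeq PST(S)$, we can show that the following $\mathcal{R}_1$ and $\mathcal{R}_2$
are satisfaction relations.

$\mathcal{R}_1 = \{(P, ST(S)) : P \models S\} \cup \{(P, PST(S)) : P \models S\}$

$\mathcal{R}_2 = \{(P, S) : P \models ST(S)\} \cup \{(P, S) : P \models PST(S)\}$

The relation $S_0 \simeq ST_0(S_0) \simeq PST_0(S_0) \simeq SS_0(S_0)$ can be shown by similar
satisfaction relations. □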

Proposition 4.2 Let $S \in \mathcal{S}$. Then $ST(S) \in STD$ and $PST(S) \in Pre(STD)$.
Proof (outline) We can show that the following $V$ and $U$ are a pre-U set and a
standard set, respectively: $V = \{PST(S) : S \in \mathcal{S}\}$ and $U = \{ST(S) : S \in \mathcal{S}\}$. □

The above examples SAB and SBA are used again. The specification SAB is
transformed by ST into the following specification:

$ST(SAB) := \triangleleft a; ST(SAB) \vee \circ b; ST(SAB) \vee \circ a; PST(SAB)$

$PST(SAB) := \triangleleft a; PST(SAB) \vee \circ b; ST(SAB)$

The specification $ST(SBA)$ is symmetrical with $ST(SAB)$ for $a$ and $b$. The state
$(\circ a; PST(SAB))$ is important: through it, $ST(SAB)$ contains a stable state which can
perform the action $a$. This implies that $ST(SAB)$ and $ST(SBA)$ can reach stable
states at the same time. In fact, we can prove that $ST(SAB) \sim ST(SBA)$.

Now, the relation $\sim$ can be used for checking the consistency of two
specifications, as shown in Proposition 4.3. The relation $\sim$ can be automatically checked
by a similar algorithm to one for bisimilarity |0|. 

Proposition 4.3 Let $S, T \in STD$. Then $S \sim T$ iff $Proc(S) \cap Proc(T) \neq \emptyset$.
Proof ('if' part) We show that the following $\mathcal{R}$ is a consistent relation.

7^ = {(5, T) : P 1= 5, P ^ T, S' G STD U Pre{STD), T G STD U Pre{STD)} 

Let P ^ S, P ^ r, S G STD, and T G STD. Since P ^ S, there exists So such 
that S I— > So and P |= Sq. Here, by Definition 23 for some S'q G Q, S Sq and 
P 1= Sg. Similarly, for some Tq G Q, T Tq and P \= Tq. 

For (i), let Sg S'. Since P ^ Sg G So, for some P', P P' and 
P' h S'. Furthermore, since P h U, for some T', ^ T' and P' h T' . 

Here, by Definition 23 S' G STDU Pre{STD) and T' G STDU Pre{STD). Thus, 
{S' ,T') G TZ. For {ii), it is symmetrical. Consequently, (S, T) G 0^^^{TZ). 

For the other cases such that S G Pre{STD) and T G STD, S can reach 
either a state S' G STD or a stop, because P [= S (i.e. S must reach a stable 
state). Hence, these cases can be shown by induction on n of (P, S) G 

(‘only if’ part) Assume that S ~ T. By the definition of there exist Sq 
and To such that Sq ~ To, S Sq, and T ^ Tq. Then, it can be proven that the 
following process CP^"^(S'o, To) satisfies both Sq and To, where n = liSoiTol = 
minjn : (5o,To) G 0^"^(~)}. The detail is omitted. 

CP(")(5o,To) :=E{oo;CP("')(^^,T') :3(y,T'),|^^,T'| = n',n = 0, 

5g HL, ^ , To Hb, T' ^ T', ~ T'} D 

E{oa;CP("')(,5',T') : 3(^ , T'), T'| = n'<n- 1, 

s^al^s'^s'q.TqAl^t'^ti^.s'q^t;,} □ 

By Proposition 4.3, the above relation $ST(SAB) \sim ST(SBA)$ implies that
$ST(SAB)$ and $ST(SBA)$ have common processes. Furthermore, this implies that SAB
and SBA have common processes by Proposition 4.1; thus they are consistent.

Proposition 4.3 shows a method to produce a common process $CP^{(n)}(S_0, T_0)$
of two specifications $S$ and $T$ (i.e. $CP^{(n)}(S_0, T_0) \in Proc(S) \cap Proc(T)$). We can
also use $CP^{(n)}(S_0, S_0)$ for producing an executable process $P$ from a specification
$S$ such that $P \models S$, where $S \mapsto S_0$.

In the rest of this section, we give a relation = which implies 
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Definition 48 A relation TZ C S x S is a full consistent relation, if {S, T) G TZ 
implies that the following conditions (1) and (2) hold: 

(1) for all So such that S Sq, for some Tq, T i— > Tq, and (t), {ii), (Hi) hold, 

(2) for all To such that T Tq, for some Sq, S i— > Sq, and (i), {ii), (Hi) hold, 

(i) for all a and S', if Sq S' then, for some T' , To~^T', and (S' ,T')gTZ, 

(ii) for all a and T' , if To~^T' then, for some S', So~^S', and (S' ,T')gTZ, 

(Hi) either Sq -f-^ or So gQ) if and only if either To -f-^ or To G Qi □ 

Definition 49 S and T are fully consistent, written S = T, if (S, T) G TZ for 
some full consistent relation TZ. □ 

The condition (Hi) requires that So is stable if and only if To is stable. For 
example, (Si = <ao; S' |] a; S') and (S 2 = <ia; S) are fully consistent, because both 
Si and S 2 are unstable. 

The full consistency is an equivalence relation. And the full consistency im- 
plies that two specifications have the same processes as follows. 

Proposition 44 Let S,T G S. If S = T,then S T (i.e. Proc(S) = Proc(T)). 
Proof It can be shown that the relation {(P,T) : 3S,P \= S, S = T} is a 
satisfaction relation. This proof is not difficult. □ 

The opposite direction of Proposition El (i.e. if S ~ T, then S = T) does 
not always hold. For example, Ai := a; Ai and A 2 := < 0 ; a; A 2 have the same 
processes, but Ai ^ A 2 . The relation ~ is a simple sufficient condition for 

4.2 Conjunction Specification 

In this subsection, we present a method for producing a conjunction specification
from two specifications. Together with the relation $\sim$, this method makes it possible to
check the consistency of three or more specifications, and to produce a
conjunction specification of them.

The key idea for producing conjunction specifications is the standard form
defined in Definition 4.6. If two specifications are not in standard form, then their
eventualities will be confused with each other. Another important point of our
method is that the non-determinism of Choices $[]$ and Disjunctions $\vee$ is considered.
Because of this non-determinism, each state in a specification can be consistent with a number
of different states in another specification. By considering the non-determinism,
we present the $\wedge$-method which produces a specification constant $S \wedge T$ from two
specifications $S$ and $T$. Since we consider only finite state specifications $S$ and
$T$, the number of states of $S \wedge T$ is also finite.

Definition 410 Let S,T G S. The specification S AT is defined as follows. 

SAT:= V{SoATo :S^So,T^ro,So~To} 

So A To = V{5' AT' : To ^ T' , S' ~ T'} : So ^ S', St(So) = D 

V{5' A r : So ^ S', S' ^ T'} : To T', St(To) = V'} 



□

The specification $S \wedge T$ performs common actions of $S$ and $T$, like $S\ |[G]|\ T$.
The main difference from $S\ |[G]|\ T$ is that $S \wedge T$ keeps the relation $\sim$ between $S$
and $T$. For example, compare the following $S_{ab} \wedge S_{ac}$ and $S_{ab}\ |[a,b,c]|\ S_{ac}$.

$S_{ab} \wedge S_{ac} = a; stop$, $S_{ab}\ |[a,b,c]|\ S_{ac} = a; stop \vee stop$

where $S_{ab} = a; stop \vee b; stop$ and $S_{ac} = a; stop \vee c; stop$. $S_{ab} \wedge S_{ac}$ is exactly the
common specification $a; stop$ of $S_{ab}$ and $S_{ac}$. On the other hand, $S_{ab}\ |[a,b,c]|\ S_{ac}$
also contains the specification $stop$. The $stop$ arises from the non-determinism
of Disjunctions $\vee$, for example, by unlabelled transitions $S_{ab} \mapsto b; stop$ and
$S_{ac} \mapsto c; stop$, where $b; stop\ |[a,b,c]|\ c; stop$ cannot perform any action. The condition $S_0 \sim T_0$ in
Definition 4.10 avoids mismatches by non-determinism, such as $b; stop$ and $c; stop$.

Furthermore, by the non-determinism of Choices $[]$, there may exist two or more
specifications $T_1', T_2'$ such that $T_0 \xrightarrow{a} T_i'$, $S' \sim T_i'$, and $T_1' \neq T_2'$, for each $S_0 \xrightarrow{a} S'$.
For example, if $S_0 = a; (b; stop \vee c; stop)$ and $T_0 = a; b; stop\ []\ a; c; stop$, then
$S_0 \xrightarrow{a} S' = b; stop \vee c; stop$, $T_0 \xrightarrow{a} T_1' = b; stop$, $T_0 \xrightarrow{a} T_2' = c; stop$,
$S' \sim T_1'$, $S' \sim T_2'$, and $T_1' \neq T_2'$. In this case, a consistent pair such as $(S', T')$
cannot be uniquely decided. Therefore, all the consistent pairs are flexibly
combined by Disjunctions, as shown in the part $\bigvee \{S' \wedge T' : T_0 \xrightarrow{a} T', S' \sim T'\}$ in
Definition 4.10.

Then, we give the expected proposition for the $\wedge$-method.

Proposition 4.5 Let $S, T \in STD$. If $S \sim T$, then $S \wedge T$ is a conjunction
specification of the specifications $S$ and $T$, thus $Proc(S \wedge T) = Proc(S) \cap Proc(T)$.
Proof We can show that the following $\mathcal{R}_1$ and $\mathcal{R}_2$ are satisfaction relations.

$\mathcal{R}_1 = \{(P, S) : \exists (S_0, T_0), S \mapsto S_0, S_0 \sim T_0, P \models S_0 \wedge T_0\}$

$\mathcal{R}_2 = \{(P, U) : \exists (S_0, T_0), U \mapsto S_0 \wedge T_0, P \models S_0, P \models T_0,$

$(Dri(S_0), Dri(T_0) \subseteq \text{either } STD \text{ or } Pre(STD))\}$

For $\mathcal{R}_2$, the key point is that $S_0$ and $T_0$ can reach stable states at the same time,
because all their derivations are in (pre-)standard form. This means that
$S_0 \wedge T_0$ can also reach a stable state. The detail is omitted. □

The two specifications $ST(SAB)$ and $ST(SBA)$ in Subsection 4.1 are used
again. By the $\wedge$-method, the following specifications are produced.

Gi = SI{SAB) A SI{SBA) := oa; Gi V <5; Gi V oa; G2 V 06; G3 
G2 = FSllsAB) A SllsBA) := <6; Gi V da; C2 V 06; G3 
G3 = SI(SAB) A FSllsBA.) := oa; Gi V oa; C2 V 06; G3 

Then, by Proposition 4.5 and Proposition 4.1,

$Proc(C_1) = Proc(ST(SAB)) \cap Proc(ST(SBA)) = Proc(SAB) \cap Proc(SBA)$.

In order to reach a stable state from $C_1$, either $(\circ a; C_2)$ or $(\circ b; C_3)$ must be
eventually selected. If $(\circ a; C_2)$ is selected, then $C_2$ requires that $b$ must be eventually
performed. Thus, $C_1$ requires that $a$ and $b$ must always eventually be performed.

Finally, we give a theorem to produce a conjunction specification of three 
or more specifications by iteratively using ~ and the A-method (and the trans- 
formation by ST (S')). This theorem also shows how to check their consistency. 






Theorem 2. Let $T_1, T_2 \in STD$, $T_1$ be a conjunction specification of $S_1, \ldots, S_m$,
and $T_2$ be a conjunction specification of $S_{m+1}, \ldots, S_n$.

(1) If $T_1 \sim T_2$, then $T_1 \wedge T_2$ is a conjunction specification of $S_1, \ldots, S_n$.

(2) If $T_1 \not\sim T_2$, then $S_1, \ldots, S_n$ are not consistent.

Proof This is easily proven by Proposition 4.3, Proposition 4.5 and the
properties of intersection of sets. □

5 Conclusion and Related Work 

In this paper, we have considered how to introduce the least fixpoint and
conjunction of the $\mu$-calculus [1,7,15] into LOTOS. In order to express the least fixpoint in
a Labelled Transition System, we have proposed an extended LTS called $\mu$LTS,
and have defined a language $\mu$LOTOS based on the $\mu$LTS. Then, the $\wedge$-method
has been presented for producing a conjunction specification. In general, the
conjunction specification $S$ is not executable, because it may contain Disjunctions,
but an executable process can be produced from $S$ by CP in Proposition 4.3.

As a related work on flexible specifications, Larsen presented Modal Specifi- 
cations to express loose specifications by required transitions — and allowed 
transitions — ><> in Pj and a language called modal CCS based on the transitions 
in (HlDj . The difference between modal CCS and /rLOTOS is explained by the 
following specifications in modal CCS and S2 in /rLOTOS. 

Si := o*; Si [] stop, S2 := <a; S2 V 6 ; stop, 

where o* represents an allowed action and bn represents a required action (LO- 
TOS syntax is used also for ^i). The following process Pi satisfies both Si and 
S2, while the process P2 satisfies only Si, because the action a must not be in- 
finitely performed in S2, and the process P3 satisfies only S2, because the action 
b can not be postponed in ^i. 

Pi := 6 ; stop, P2 := o; P2 [] 6 ; stop, P3 := a; b; stop 

The basic idea of the /rLTS arose from the notion of divergence El (p-148 
in m) which can avoid infinite loop by internal actions in the notion of diver- 
gence. An unstable state in pLTS is intuitively considered as a state which can 
perform internal actions. But internal actions are needed for expressing dynamic 
behavior such as timeout, and they should not be used for controlling resolution 
of Disjunction operators. Therefore, we have introduced an un-stabilizer <. 

For integration or refinement of specifications, a number of approaches were 
proposed, for example |2| El Brinksma|2| proposed a refined parallel op- 
erator for multiple labels. This operator is used to implement conjunction of 
LOTOS specifications. Steen et al. HSl proposed a conjunction operator O and a 
join operator N in order to yield a common reduction and a common extension, 
respectively, in LOTOS. Larsen et al. HD! defined a conjunction operator A for 
loose specifications in modal CCS. However, these approaches do not consider
the non-determinism of Disjunction operators $\vee$. Therefore, they cannot be
directly applied to $\mu$LOTOS.






For logical requirements, synthesis algorithms of processes were proposed 
in 13 and El. Kimura et al.(3 presented a synthesis algorithm for recursive 
processes by subcalculus of /r-calculus, but the subcalculus does not contain the 
disjunction V. Manna et al. m presented an algorithm for synthesizing a graph 
from requirements described in Propositional Temporal Logic (PTL). In PTL, 
eventualities can be expressed by an operator o, but the synthesized graph from 
PTLs does not always represent all the common processes of them. 
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A Appendix 





Table 1. The notations used in this paper 


Notation 


Meaning 


N 

Act 

If 

K. 

X 

M 

Mo 

Mlfj. 

S 

So 

V 

STD 

Pre{STD) 

Var{M) 

Proc{S) 

Dri(So) 

h 


the set of names a,b, - ■ ■ 

the set of actions q, /3, • ■ (Act = J\f U {i}) 

the set of state operators , (9 = {o, <}) 

the set of specification constants (Constants) A,B,--- 

the set of specification variables (Variables) 

the set of specihcation expressions M, N, - ■ ■ 

the set of specihcation expressions such that if Mo e- > Mq then Mo = Mq 
subsets of M, Definition l.‘-{.'-{la,nri t'-{4l 
the set of specifications S,T,U, - ■ ■ 

the set of specifications such that if So i— > S'o then So = S'o 
the set of processes P,Q, - ■ ■ 

the set of specifications in standard form, Definition I4til 
the set of specifications in pre-standard form 
the set of Variables in M 

the set of all the processes to satisfy S, (Proc(S) = {P : P \= S'}) 
the set of derivations of So (Dri(So) = {S' : 3a, So S'}) 

satisfaction 

a relation for checking consistency 
the relation {(S, T) : Proc(S) = Proc(T)} 
full consistency, Definition 
definition of specification constants 


eiu) 

e(7^) 

St 

SI{S) 


syntactic identity 

a relation to define }=, Definition 1811 
a relation to define ~, Definition 

a state function (if So G O then St(So) = otherwise St(So) = < 
a specification in standard form, transformed from S 
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Abstract. We investigate an automata-theoretic model of distributed 
systems which communicate via message-passing. Each node in the sys- 
tem is a finite-state device. Channels are assumed to be reliable but may 
deliver messages out of order. Hence, each channel is modelled as a set 
of counters, one for each type of message. These counters may not be 
tested for zero. 

Though each node in the network is finite-state, the overall system is 
potentially infinite-state because the counters are unbounded. We work 
in an interleaved setting where the interactions of the system with the 
environment are described as sequences. The behaviour of a system is 
described in terms of the language which it accepts — that is, the set 
of valid interactions with the environment that are permitted by the 
system. 

Our aim is to characterise the class of message-passing systems whose 
behaviour is finite-state. Our main result is that the language accepted by 
a message-passing system is regular if and only if both the language and 
its complement are accepted by message-passing systems. We also exhibit 
an alternative characterisation of regular message-passing languages in 
terms of deterministic automata. 



1 Introduction 

Today, distributed systems which use asynchronous communication are ubiqui- 
tous — the Internet is a prime example. However, there has been very little work 
on studying the finite-state behaviour of such systems. In particular, this area 
lacks a satisfactory automata-theoretic framework. In contrast, automata the- 
ory for systems with synchronous communication is well developed via Zielonka’s 
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asynchronous automata EHa and the connections to Mazurkiewicz trace the- 
ory |M7H) . 

In jM INI H.St)?^ . we introduce networks of message-passing automata as a model 
for distributed systems which communicate via message-passing. Each node in 
the network is a finite-state process. The number of different types of messages 
used by the system is assumed to be finite. This is not unreasonable if we dis- 
tinguish “control” messages from “data” messages. 

In our model, channels may reorder or delay messages, though messages are 
never lost. Since messages may be reordered, the state of each channel can be 
represented by a finite set of counters which record the number of messages of 
each type that have been sent along the channel but are as yet undelivered. 
The nodes cannot test if a counter’s value is zero — this restriction captures the 
intuition that it is not practical for a node to decide that another process has 
not sent a message, since messages may be delayed arbitrarily. 

Though each node in the network is finite-state, the overall system is poten- 
tially infinite-state since counter values are unbounded. Our goal is to charac- 
terise when such a network is “effectively finite-state” . This is important because 
finite-state networks are amenable to verification using automated tools pm . 

To make precise the notion of a network being “effectively finite-state” , we use 
formal language theory. The behaviour of the network is described in terms of its 
interaction with the environment. This can be represented as a formal language 
over a finite alphabet of possible interactions. Our goal then is to characterise 
when the language accepted by a network is regular. 

In , we assume that each node interacts independently with its en- 

vironment. Thus, the behaviour of the overall network is described as a language 
consisting of tuples of strings. The main result is that the language accepted by 
a robust message-passing network, whose behaviour is insensitive to message de- 
lays and differences in speed between nodes, can be “represented” by a sequential 
regular language. 

Here, we adopt an interleaved approach and record the interactions of a 
network with its environment from the point of view of a sequential observer. 
In this framework, it is sufficient to concentrate on the global states of the 
system and regard the entire network as a single automaton equipped with a 
set of counters. Our main result is that a language L accepted by a message- 
passing automaton is regular if and only if the complement of L is also accepted 
by a message-passing automaton. This is more general than requiring that L 
be robust in the sense of — see Section E21 We also demonstrate an 

alternative characterisation in terms of deterministic message-passing automata. 
Along the way, we describe a variety of results about message-passing automata, 
including pumping lemmas which are useful for showing when languages are not 
recognisable by these automata. 

The paper is organised as follows. In the next section we define message-passing
automata and establish some basic results about them. In Section 3 we
prove a Contraction Lemma which leads to the decidability of the emptiness
problem and the fact that the languages accepted by message-passing automata
are not closed under complementation. Section 4 describes a family of pumping
lemmas which are exploited in Section 5 to prove our main results concerning
the regularity of languages accepted by message-passing automata. In the final
section, we discuss in detail the connection between our results and those in
Petri net theory and point out directions for future work. We have had to omit
many proofs in this extended abstract. Full proofs and related results can be
found in [MNRS97, MNRS98].

2 Message-Passing Automata 

Natural numbers and tuples As usual, $\mathbb{N}$ denotes the set $\{0, 1, 2, \ldots\}$ of
natural numbers. For $i, j \in \mathbb{N}$, $[i..j]$ denotes the set $\{i, i+1, \ldots, j\}$, where $[i..j] = \emptyset$
if $i > j$. We compare $k$-tuples of natural numbers component-wise. Let
$\overline{m} = (m_1, m_2, \ldots, m_k)$ and $\overline{n} = (n_1, n_2, \ldots, n_k)$. Then $\overline{m} \le \overline{n}$ iff $m_i \le n_i$ for each
$i \in [1..k]$.
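The component-wise order is only a partial order, which matters for the pumping arguments later on. A minimal sketch in Python, with the hypothetical helper name `leq` and tuples of integers standing in for $k$-tuples of natural numbers:

```python
def leq(m, n):
    """Component-wise comparison of two k-tuples of natural numbers."""
    if len(m) != len(n):
        raise ValueError("tuples must have the same length")
    return all(a <= b for a, b in zip(m, n))
```

Two tuples may be incomparable: neither `leq((1, 2), (2, 1))` nor `leq((2, 1), (1, 2))` holds.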

Message-passing automata A message-passing automaton $A$ is a tuple
$(Q, \Sigma, \Gamma, T, q_{in}, F)$, where:

— $Q$ is a finite set of states, with initial state $q_{in}$ and accepting states $F \subseteq Q$.

— $\Sigma$ is a finite input alphabet.

— $\Gamma$ is a finite set of counters. We use $C, C', \ldots$ to denote counters. With each
counter $C$, we associate two symbols, $C^+$ and $C^-$. We write $\Gamma^{\pm}$ to denote
the set $\{C^+ \mid C \in \Gamma\} \cup \{C^- \mid C \in \Gamma\}$.

— $T \subseteq Q \times (\Sigma \cup \Gamma^{\pm}) \times Q$ is the transition relation.



Configurations A configuration of $A$ is a pair $(q, f)$ where $q \in Q$ and $f : \Gamma \to \mathbb{N}$
is a function which records the values stored in the counters. If the counters are
$C_1, C_2, \ldots, C_k$ then we represent $f$ by an element $(f(C_1), f(C_2), \ldots, f(C_k))$ of
$\mathbb{N}^k$. By abuse of notation, the $k$-tuple $(0, 0, \ldots, 0)$ is uniformly denoted $\overline{0}$, for all
values of $k$.

We use $\chi$ to denote configurations. If $\chi = (q, f)$, $Q(\chi)$ denotes $q$ and $F(\chi)$
denotes $f$. Further, for each counter $C$, $C(\chi)$ denotes the value $f(C)$.

Moves Each move of a message-passing automaton consists of either reading a 
letter from its input or manipulating a counter. Reading from the input repre- 
sents interaction with the environment. Incrementing and decrementing counters 
correspond to sending and reading messages, respectively. 

Formally, a message-passing automaton moves from configuration $\chi$ to
configuration $\chi'$ on $d \in \Sigma \cup \Gamma^{\pm}$ if $(Q(\chi), d, Q(\chi')) \in T$ and one of the following
holds:

- $d \in \Sigma$ and $F(\chi) = F(\chi')$.

- $d = C^+$, $C(\chi') = C(\chi) + 1$ and $C'(\chi) = C'(\chi')$ for every $C' \neq C$.

- $d = C^-$, $C(\chi') = C(\chi) - 1 \ge 0$ and $C'(\chi) = C'(\chi')$ for every $C' \neq C$.

Such a move is denoted $\chi \xrightarrow{(q,d,q')} \chi'$; that is, transitions are labelled by
elements of $T$. Given a sequence of transitions $t_1 t_2 \ldots t_n = (q_1, d_1, q_2)(q_2, d_2, q_3) \cdots
(q_n, d_n, q_{n+1})$, the corresponding sequence $d_1 d_2 \cdots d_n$ over $\Sigma \cup \Gamma^{\pm}$ is denoted
$\alpha(t_1 t_2 \cdots t_n)$.
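The three kinds of move can be sketched as a small step function. The encoding is an assumption made for illustration only: a configuration is a pair of a state name and a dictionary of counter values, an input letter is a string, and a counter action $C^+$ or $C^-$ is a pair `(counter, "+")` or `(counter, "-")`. The example automaton at the end is hypothetical and is not taken from the paper.

```python
def step(config, transition):
    """Apply one transition (q, d, q'); return the new configuration or None."""
    state, counters = config
    source, label, target = transition
    if state != source:
        return None  # the transition does not start in the current state
    updated = dict(counters)
    if isinstance(label, tuple):  # counter action C+ or C-
        counter, sign = label
        if sign == "+":
            updated[counter] = updated.get(counter, 0) + 1
        elif sign == "-":
            if updated.get(counter, 0) == 0:
                return None  # decrement is blocked at zero
            updated[counter] -= 1
        else:
            raise ValueError("counter action must be '+' or '-'")
    # reading an input letter leaves all counters unchanged
    return (target, updated)


def execute(config, transitions):
    """Run a sequence of transitions; return the final configuration or None."""
    for transition in transitions:
        config = step(config, transition)
        if config is None:
            return None
    return config


# Hypothetical transitions: every 'a' is followed by C+, every 'b' preceded by C-.
read_a, inc = ("p", "a", "p1"), ("p1", ("C", "+"), "p")
dec_first, read_b, dec_next = ("p", ("C", "-"), "r1"), ("r1", "b", "r"), ("r", ("C", "-"), "r1")
start = ("p", {"C": 0})
```

Note that a decrement never tests for zero in the sense of branching on it: a blocked decrement simply has no successor configuration, which matches the restriction that counters may not be tested for zero.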

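Computations, runs and languages A computation of $A$ is a sequence
$\chi_0 \xrightarrow{t_1} \chi_1 \xrightarrow{t_2} \cdots \xrightarrow{t_n} \chi_n$. We also write $\chi_0 \stackrel{t_1 t_2 \ldots t_n}{\Longrightarrow} \chi_n$ to indicate that there is
a computation labelled $t_1 t_2 \ldots t_n$ from $\chi_0$ to $\chi_n$. Notice that $\chi_0$ and $t_1 t_2 \ldots t_n$
uniquely determine all the intermediate configurations $\chi_1, \chi_2, \ldots, \chi_n$. If the
transition sequence is not relevant, we just write $\chi_0 \Longrightarrow \chi_n$. As usual, $\chi \stackrel{t_1 t_2 \ldots t_n}{\Longrightarrow}$
denotes that there exists $\chi'$ such that $\chi \stackrel{t_1 t_2 \ldots t_n}{\Longrightarrow} \chi'$, and $\chi \Longrightarrow$ denotes that there
exists $\chi'$ such that $\chi \Longrightarrow \chi'$.

For $K \in \mathbb{N}$, a $K$-run of $A$ is a computation $\chi_0 \Longrightarrow \chi_n$ where $C(\chi_0) \le K$ for
each $C \in \Gamma$.

If $\delta$ is a string over $\Sigma \cup \Gamma^{\pm}$, $\delta{\restriction}_{\Sigma}$ denotes the subsequence of letters from
$\Sigma$ in $\delta$. Let $w = a_1 a_2 \ldots a_k$ be a string over $\Sigma$. A run of $A$ over $w$ is a 0-run
$\chi_0 \stackrel{t_1 t_2 \ldots t_n}{\Longrightarrow} \chi_n$ where $Q(\chi_0) = q_{in}$ and $\alpha(t_1 t_2 \ldots t_n){\restriction}_{\Sigma} = w$. The run is said to be
accepting if $Q(\chi_n) \in F$. The string $w$ is accepted by $A$ if $A$ has an accepting run
over $w$. The language accepted by $A$, denoted $L(A)$, is the set of all strings over
$\Sigma$ accepted by $A$.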

A language over E is said to be message-passing recognisable if there is a 
message-passing automaton with input alphabet E that accepts this language. 

Example 2.1. Let $L_{ge} \subseteq \{a, b\}^*$ be given by $\{a^m b^n \mid m \ge n\}$. This language is
message-passing recognisable. Here is an automaton for $L_{ge}$. The initial state is
indicated by $\Downarrow$ and the final states have an extra circle around them.

[Figure: message-passing automaton for $L_{ge}$]

The following result is basic to analysing the behaviour of message-passing
automata. It follows from the fact that any infinite sequence of $N$-tuples of
natural numbers contains an infinite increasing subsequence. We omit the proof.

Lemma 2.2. Let $X$ be a set with $M$ elements and $(x_1, f_1), (x_2, f_2), \ldots, (x_m, f_m)$
be a sequence over $X \times \mathbb{N}^N$ such that each coordinate of $f_1$ is bounded by $K$ and
for $i \in [1..m-1]$, $f_i$ and $f_{i+1}$ differ on at most one coordinate and this difference
is at most 1. There is a constant $\ell$ which depends only on $M$, $N$ and $K$ such
that if $m > \ell$, then there exist $i, j \in [1..m]$ with $i < j$, $x_i = x_j$ and $f_i \le f_j$.
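The pair $(i, j)$ promised by Lemma 2.2 can be searched for directly in any concrete finite sequence. The sketch below is a brute-force illustration under assumed conventions (0-based positions, each element a pair of a hashable $x$ and a tuple $f$); the function name `find_dominated_pair` is hypothetical, and the code says nothing about the size of the constant $\ell$.

```python
def find_dominated_pair(sequence):
    """Return the first (i, j) with i < j, x_i == x_j and f_i <= f_j, or None.

    `sequence` is a list of pairs (x, f) where f is a tuple of naturals;
    positions are 0-based and f_i <= f_j is taken component-wise.
    """
    for j in range(len(sequence)):
        x_j, f_j = sequence[j]
        for i in range(j):
            x_i, f_i = sequence[i]
            if x_i == x_j and all(a <= b for a, b in zip(f_i, f_j)):
                return (i, j)
    return None
```

A short sequence may contain no such pair, for example when the tuples form an antichain; the lemma only guarantees a pair once the sequence is long enough.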

Weak pumping constant We call the bound $\ell$ for $M$, $N$ and $K$ from the
preceding lemma the weak pumping constant for $(M, N, K)$, denoted $\pi_{M,N,K}$.
It is easy to see that if $(M', N', K') \le (M, N, K)$, then $\pi_{M',N',K'} \le \pi_{M,N,K}$.

3 A Contraction Lemma



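Lemma 3.1 (Contraction). For every message-passing automaton $A$, there is
a constant $k$ such that if $\chi_0 \stackrel{t_1 t_2 \ldots t_m}{\Longrightarrow} \chi_m$ is a computation of $A$, with $m > k$, then
there exist $i$ and $j$, $m-k \le i < j \le m$, such that $\chi'_0 \stackrel{t_1 \ldots t_i t_{j+1} \ldots t_m}{\Longrightarrow} \chi'_{m-(j-i)}$ is also
a computation of $A$, with $\chi'_{\ell} = \chi_{\ell}$ for $\ell \in [0..i]$ and $Q(\chi_{\ell}) = Q(\chi'_{\ell-(j-i)})$
for all $\ell \in [j..m]$.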


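Proof Sketch. Let $A$ have $M$ states and $N$ counters. We show that $k$ can be
chosen to be $\pi_{M,N,0}$. Let $\chi_0 \stackrel{t_1 t_2 \ldots t_m}{\Longrightarrow} \chi_m$ be a computation of $A$, with $m > \pi_{M,N,0}$.

We define a sequence $f_m, f_{m-1}, \ldots, f_0$ of functions from $\Gamma$ to $\mathbb{N}$ as follows:

$f_m(C) = 0$, for all $C \in \Gamma$.

For $i \in [0..m-1]$,
$$f_i(C) = \begin{cases} f_{i+1}(C) & \text{if } \alpha(t_{i+1}) \notin \{C^+, C^-\} \\ f_{i+1}(C) + 1 & \text{if } \alpha(t_{i+1}) = C^- \\ \max(0, f_{i+1}(C) - 1) & \text{if } \alpha(t_{i+1}) = C^+ \end{cases}$$

We claim, without proof, that for each $i$, the function $f_i$ represents the minimum
counter values required to execute the transition sequence $t_{i+1} t_{i+2} \ldots t_m$.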
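The backward recurrence for $f_m, f_{m-1}, \ldots, f_0$ is easy to compute. The following sketch assumes a hypothetical encoding in which the label sequence $\alpha(t_1 \ldots t_m)$ is given as a Python list, input letters are strings, and counter actions are pairs such as `("C", "+")` or `("C", "-")`; the function name `minimum_requirements` is likewise an illustration, not notation from the paper.

```python
def minimum_requirements(labels, counters):
    """Compute f_0, ..., f_m for a label sequence alpha(t_1 ... t_m).

    labels[i] is the label of t_{i+1}; the result is a list of dicts where
    result[i][C] is the least value of C needed to execute t_{i+1} ... t_m.
    """
    m = len(labels)
    f = [None] * (m + 1)
    f[m] = {c: 0 for c in counters}  # nothing left to execute
    for i in range(m - 1, -1, -1):
        f[i] = dict(f[i + 1])
        label = labels[i]
        if isinstance(label, tuple):
            counter, sign = label
            if sign == "-":
                # a decrement needs one more token than the rest of the run
                f[i][counter] = f[i + 1][counter] + 1
            elif sign == "+":
                # an increment supplies one token, but never below zero
                f[i][counter] = max(0, f[i + 1][counter] - 1)
    return f


example_labels = [("C", "+"), "a", ("C", "-"), ("C", "-")]
requirements = minimum_requirements(example_labels, ["C"])
```

In this example one increment is followed by two decrements, so the counter must already hold at least one token at the start.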



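Claim: $\forall i \in [0..m]$, $(Q(\chi_i), f) \stackrel{t_{i+1} t_{i+2} \ldots t_m}{\Longrightarrow}$ iff $f \ge f_i$.

Corollary to Claim: For each counter $C$ and for each position $i \in [0..m]$,
$C(\chi_i) \ge f_i(C)$.

Consider the sequence of $N$-tuples $f_m, f_{m-1}, \ldots, f_0$. Since its length exceeds
$\pi_{M,N,0}$, by Lemma 2.2 there exist positions $i$ and $j$, $m \ge j > i \ge m - \pi_{M,N,0}$, such
that $f_j \le f_i$ and $Q(\chi_j) = Q(\chi_i)$. By the Corollary to Claim, for each counter $C$,
$C(\chi_i) \ge f_i(C) \ge f_j(C)$. Thus, $\chi_i \stackrel{t_{j+1} \ldots t_m}{\Longrightarrow}$, whereby $\chi_0 \stackrel{t_1 \ldots t_i}{\Longrightarrow} \chi_i \stackrel{t_{j+1} \ldots t_m}{\Longrightarrow}
\chi'_{m-(j-i)}$ is a valid computation of $A$ for some configuration $\chi'_{m-(j-i)}$. Since
$Q(\chi_j) = Q(\chi_i)$ and the computations $\chi_j \stackrel{t_{j+1} \ldots t_m}{\Longrightarrow} \chi_m$ and $\chi_i \stackrel{t_{j+1} \ldots t_m}{\Longrightarrow}
\chi'_{m-(j-i)}$ are labelled by the same sequence of transitions, it follows that $Q(\chi_{\ell}) =
Q(\chi'_{\ell-(j-i)})$ for each $\ell \in [j..m]$, as required. □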



Corollary 3.2. A message-passing automaton $A$ with $M$ states and $N$ counters
has an accepting computation iff it has an accepting computation whose length
is bounded by $\pi_{M,N,0}$.

It is possible to provide an explicit upper bound for $\pi_{M,N,K}$ for all values of
$M$, $N$, and $K$. This fact, coupled with the preceding observation, yields the
following result (which can also be derived from the decidability of the reachability
problem for Petri nets).

Corollary 3.3. The emptiness problem for message-passing automata is decidable.

Corollary 3.4. Message-passing recognisable languages are not closed under
complementation.

Proof Sketch. We saw earlier that $L_{ge} = \{a^m b^n \mid m \ge n\}$ is message-passing
recognisable. We show that the language $L_{lt} = \{a^m b^n \mid m < n\}$ is not
message-passing recognisable. Suppose that $L_{lt}$ is accepted by an automaton $A_{lt}$ with
$M$ states and $N$ counters. Consider the string $w = a^J b^{J+1}$, where $J = \pi_{M,N,0}$,
and let $\rho : \chi_0 \stackrel{t_1 t_2 \ldots t_m}{\Longrightarrow} \chi_m$ be an accepting run of $A_{lt}$ on $w$. By applying the
Contraction Lemma (repeatedly, if necessary) to $\rho$, we can obtain an accepting
run $\rho'$ of $A_{lt}$ over a word of the form $a^J b^K$, where $K \le J$, thus contradicting
the assumption that $L(A_{lt}) = L_{lt}$. □



4 A Collection of Pumping Lemmas 

Our main result is based on a series of pumping lemmas, which we present in this 
section. For reasons of space, we do not provide some of the proofs. More details 
may be found in [MNRS97, MNRS98]. Some of the results of this section were 
used in to show that robust asynchronous protocols are necessarily 

finite state. 

Change vectors For a string $w$ and a symbol $x$, let $\#_x(w)$ denote the number
of times $x$ occurs in $w$. Let $v$ be a sequence of transitions. Recall that $\alpha(v)$
denotes the corresponding sequence of letters. For each counter $C$, define $\Delta_C(v)$
to be $\#_{C^+}(\alpha(v)) - \#_{C^-}(\alpha(v))$. The change vector associated with $v$, denoted
$\Delta_v$, is given by $(\Delta_C(v))_{C \in \Gamma}$.
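As a concrete reading of this definition, the change vector can be computed by counting increments and decrements in the label sequence. The sketch uses an assumed encoding (input letters as strings, counter actions as pairs `(counter, "+")` or `(counter, "-")`), and the name `change_vector` is hypothetical.

```python
from collections import Counter


def change_vector(labels, counters):
    """Map each counter C to #C+ minus #C- in the label sequence alpha(v)."""
    occurrences = Counter(label for label in labels if isinstance(label, tuple))
    return {c: occurrences[(c, "+")] - occurrences[(c, "-")] for c in counters}


block = [("C", "+"), "a", ("C", "-"), ("C", "-"), ("D", "+")]
delta = change_vector(block, ["C", "D", "E"])
```

Input letters contribute nothing, and a counter that is never touched has change 0.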

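Proposition 4.1. Let $A = (Q, \Sigma, \Gamma, T, q_{in}, F)$ be a message-passing automaton.

(i) For any computation $\chi \stackrel{v}{\Longrightarrow} \chi'$ of $A$ and any counter $C \in \Gamma$, $|\Delta_C(v)| \le |v|$.

(ii) For any configuration $\chi$ and sequence of transitions $v$, $\chi \stackrel{v}{\Longrightarrow}$ iff for each
prefix $u$ of $v$ and each counter $C \in \Gamma$, $C(\chi) + \Delta_C(u) \ge 0$.

(iii) Let $\chi \stackrel{v}{\Longrightarrow} \chi'$ with $Q(\chi) = Q(\chi')$ and $n \in \mathbb{N}$ such that, for every counter
$C \in \Gamma$, either $\Delta_C(v) \ge 0$ or $C(\chi) \ge n|v| + |v|$. Then, $\chi \stackrel{v^{n+1}}{\Longrightarrow}$.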
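Part (ii) says that, as far as the counters are concerned, a transition sequence is executable exactly when no prefix drives a counter below zero. The sketch below checks only this counter condition, under an assumed encoding (strings for input letters, pairs `(counter, "+")` or `(counter, "-")` for counter actions); it does not check that the states of consecutive transitions match. The name `counters_stay_nonnegative` is hypothetical.

```python
def counters_stay_nonnegative(initial, labels):
    """Check C(chi) + Delta_C(u) >= 0 for every prefix u of the label sequence."""
    values = dict(initial)
    for label in labels:
        if isinstance(label, tuple):
            counter, sign = label
            values[counter] = values.get(counter, 0) + (1 if sign == "+" else -1)
            if values[counter] < 0:
                return False  # some prefix makes this counter negative
    return True


# A block of length 2 that decrements C once; repeating it n + 1 = 3 times
# is safe when C starts at n * |v| + |v| = 6, as in part (iii) with n = 2.
v = [("C", "-"), ("D", "+")]
```

The bound in part (iii) is sufficient rather than tight: here three tokens already suffice for three repetitions, while two do not.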

Proof. 

(i) This follows from the fact that each move can change a counter value by at 
most 1. 

(ii) This follows immediately from the definition of a computation. 

(iii) The proof is by induction on n. 

Basis: For n = 0, there is nothing to prove. 

Induction step: Let n > 0 and assume the result holds for n— 1. We will show 
that x^X ■ 
From the assumption, we know that x x'- To show that x' we 

examine the value of each counter C at x'- ^ ^c{u) < 0, then C(x) > 
n\u\ + V. Since C{x') = C{x') + Ac{u) and \Ac{u)\ < |u|, it follows that 
C(x') ^ From the induction hypothesis, we can then conclude 



□ 
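Part (ii) of the proposition can be checked on small examples. The sketch below, under the same assumed encoding as before (strings for input letters, pairs `(C, +1)` / `(C, -1)` for counter moves), compares a direct step-by-step simulation of the counters with the prefix condition $C(\chi) + \Delta_C(u) \ge 0$. State changes are ignored; only counter enabledness is modelled, and all function names are hypothetical.

```python
def change_vector(moves, counters):
    """Net change of each counter over a sequence of labels."""
    delta = {c: 0 for c in counters}
    for m in moves:
        if isinstance(m, tuple):
            counter, step = m
            delta[counter] += step
    return delta


def enabled_by_simulation(start, moves):
    """Run the moves one by one; a decrement of a zero counter is disabled."""
    values = dict(start)
    for m in moves:
        if isinstance(m, tuple):
            counter, step = m
            if step < 0 and values[counter] == 0:
                return False
            values[counter] += step
    return True


def enabled_by_prefixes(start, moves):
    """Prefix condition: C(chi) + Delta_C(u) >= 0 for every prefix u."""
    for k in range(1, len(moves) + 1):
        delta = change_vector(moves[:k], start)
        if any(start[c] + delta[c] < 0 for c in start):
            return False
    return True


good = [("C", -1), "a", ("C", +1), ("C", -1)]
bad = [("C", -1), ("C", -1), ("C", +1)]
print(enabled_by_simulation({"C": 1}, good), enabled_by_prefixes({"C": 1}, good))
print(enabled_by_simulation({"C": 1}, bad), enabled_by_prefixes({"C": 1}, bad))
```

The two checks agree because a decrement is blocked exactly when it would take some prefix sum below zero.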

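Pumpable decomposition Let $A$ be a message-passing automaton with $N$
counters and let $\rho : \chi_0 \Longrightarrow \chi_m$ be a computation of $A$. A decomposition

$\chi_0 \stackrel{u_1}{\Longrightarrow} \chi_{i_1} \stackrel{v_1}{\Longrightarrow} \chi_{j_1} \stackrel{u_2}{\Longrightarrow} \chi_{i_2} \stackrel{v_2}{\Longrightarrow} \chi_{j_2} \cdots \stackrel{u_n}{\Longrightarrow} \chi_{i_n} \stackrel{v_n}{\Longrightarrow} \chi_{j_n} \stackrel{u_{n+1}}{\Longrightarrow} \chi_m$

of $\rho$ is said to be pumpable if it satisfies the following conditions:

(i) $n \le N$.

(ii) For each $k \in [1..n]$, $Q(\chi_{i_k}) = Q(\chi_{j_k})$.

(iii) For each $v_k$, $k \in [1..n]$, $\Delta v_k$ is non-zero and has at least one positive entry.

(iv) Let $C$ be a counter and $k \in [1..n]$ such that $\Delta_C(v_k)$ is negative. Then, there
exists $\ell < k$ such that $\Delta_C(v_\ell)$ is positive.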
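The four conditions can be read as a simple check over the pumpable blocks. The following is a minimal sketch, assuming each block $v_k$ is summarised by the state before it, the state after it, and its change vector as a dictionary; this representation and the name `is_pumpable` are illustrative choices, not the paper's. Condition (i) is read here as $n \le N$.

```python
def is_pumpable(blocks, num_counters):
    """Check conditions (i)-(iv) for blocks v_1..v_n.

    blocks: list of (state_before, state_after, delta), where delta maps
    each counter name to Delta_C(v_k).  Hypothetical representation.
    """
    if len(blocks) > num_counters:                 # (i)  n <= N
        return False
    raised = set()                                 # counters raised by an earlier block
    for before, after, delta in blocks:
        if before != after:                        # (ii) block is a loop on states
            return False
        if not any(d > 0 for d in delta.values()): # (iii) some positive entry
            return False
        for counter, d in delta.items():           # (iv) negative entries need an
            if d < 0 and counter not in raised:    #      earlier positive block
                return False
        raised.update(c for c, d in delta.items() if d > 0)
    return True


ok = [("q", "q", {"C": 1, "D": 0}), ("p", "p", {"C": -2, "D": 3})]
bad = [("q", "q", {"C": -1, "D": 2})]
print(is_pumpable(ok, 2), is_pumpable(bad, 2))
```

In the first example the second block lowers `C`, which is allowed because the first block raises it; in the second example `C` is lowered with no earlier block to compensate.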



We refer to $v_1, v_2, \ldots, v_n$ as the pumpable blocks of the decomposition. We say
that a counter $C$ is pumpable if $\Delta_C(v_i) > 0$ for some pumpable block $v_i$. The
following lemma shows that all the pumpable counters of a pumpable
decomposition are simultaneously unbounded. We omit the proof. (This is similar to
a well-known result of Karp and Miller in the theory of vector addition
systems [KM69].)



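Lemma 4.2 (Counter Pumping). Let $A$ be a message-passing automaton
and $\rho$ a $K$-run of $A$, $K \in \mathbb{N}$, with a pumpable decomposition of the form

$\chi_0 \stackrel{u_1}{\Longrightarrow} \chi_{i_1} \stackrel{v_1}{\Longrightarrow} \chi_{j_1} \stackrel{u_2}{\Longrightarrow} \chi_{i_2} \stackrel{v_2}{\Longrightarrow} \chi_{j_2} \cdots \stackrel{u_n}{\Longrightarrow} \chi_{i_n} \stackrel{v_n}{\Longrightarrow} \chi_{j_n} \stackrel{u_{n+1}}{\Longrightarrow} \chi_m$.

Then, for any $I, J \in \mathbb{N}$, with $I \ge 1$, there exist $\ell_1, \ell_2, \ldots, \ell_n \in \mathbb{N}$ and a $K$-run $\rho'$ of $A$ of the form

$\chi'_0 \stackrel{u_1}{\Longrightarrow} \chi'_{i'_1} \stackrel{v_1^{\ell_1}}{\Longrightarrow} \chi'_{j'_1} \stackrel{u_2}{\Longrightarrow} \chi'_{i'_2} \stackrel{v_2^{\ell_2}}{\Longrightarrow} \chi'_{j'_2} \cdots \stackrel{u_n}{\Longrightarrow} \chi'_{i'_n} \stackrel{v_n^{\ell_n}}{\Longrightarrow} \chi'_{j'_n} \stackrel{u_{n+1}}{\Longrightarrow} \chi'_p$

such that $\rho'$ satisfies the following properties:

(i) $\chi'_0 = \chi_0$.

(ii) $Q(\chi'_p) = Q(\chi_m)$.

(iii) For $i \in [1..n]$, $\ell_i \ge I$.

(iv) For every counter $C$, $C(\chi'_p) \ge C(\chi_m)$.

(v) Let $\Gamma_{pump}$ be the set of pumpable counters in the pumpable decomposition of
$\rho$. For each counter $C \in \Gamma_{pump}$, $C(\chi'_p) \ge J$.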



Proof. The proof is by induction on n, the number of pumpable blocks in the 
decomposition . 

Basis: If n = 0, there is nothing to prove. 

Induction step: Let n > 0 and assume the lemma holds for all decompositions 
with n— 1 pumpable blocks. For each counter C, let Jc = max( J, C(xm))- 
By the induction hypothesis, for all $J' \in \mathbb{N}$, $I' \ge 1$, we can transform the

„ 111 ll2, Un-l Ur. r ■ . 

prehx cr : xo Xn Xji ^ Xi„ of p into a iX-run 

^n — 1 

ct' : Xo X*; ^ Xjj ^ ^ Xj-;_^ ^ x'i; satisfying the conditions of 

the lemma. We shall choose I' and J' so that the transition sequence 
can be appended to a' to yield the run claimed by the lemma. 

To fix values for /' and J' , we first estimate the value of in, the number 
of times we need to pump Vn to satisfy all the conditions of the lemma. Let 
^pos = {C I L\c(u„) > 0}. It is sufficient if the number in is large enough for 
each counter C £ to exceed Jc at the end of the new computation. For a 
counter C G to be above Jc at the end of the computation, it is sufficient 
for C to have the value Jc + |un-i-i| after By the induction hypothesis, the 
value of C before is at least C(xi„). Hence, it would take 
iterations of for C to reach the required value after . On the other hand, 
we should also ensure that in > I- Thus, it is safe to set in to be the maximum 
of I and maxcer^Zr. I ■ 

yirri ^ 

We set I' = I and estimate a value for J' such that Xi' "=^ Xp with 
each counter C G {F \ /^"g) achieving a value of at least C{xm) at Xp and each 
counter C G (F^os \ F^os) achieving a value of at least Jc at x'p- 

By the induction hypothesis, Q(Xi' ) = Q(Xi„) and F{x'^, ) > F’(xi„). Since 

Xi„ 5 if follows that Xi' ==^ • By Proposition tt. II (iii), to ensure that 

Xi' Xp, it is sufficient to raise each counter C with Ac{vn) < 0 to a 

value of at least -I- |u„+i| at x^/ • If Ac{vn) < 0 then, by the definition of 

pumpable decompositions, Ac{vi) > 0 for some i G [l..n— 1], so C gets pumped 
above J’ in a' . 

Any counter C such that Ac(vn) > 0 will surely exceed C(xm) at Xp- On 
the other hand, a counter C such that Ac{vn) < 0 can decrease by at most 
in\vn \ + \un+i\ after x['^- 

Putting these two facts together, it suffices to set J' to in\vn \ + |u„+i| -I- 
max{c'|zic(«,.)<o}"^C- 



Let p' : Xo ^ x'; ^ x'; ^ ^ x'; ^ x'; X^- By the 

induction hypothesis, we know that Xo = Xo and for i G [l..n— 1], ii > F. By 
construction, in > I as well. We have also ensured that for every counter C, 
G{Xp) > C'(xm) and for every counter C G FJ,os, C(xP ^ J- The fact that 
QiXp) = QiXm) follows from the fact that each loop brings the automaton 
back to QiXi' ) = Q(Xi„)j and the fact that both p and p' go through the same 
sequence of transitions Un+i at the end of the computation. □ 



Having shown that all pumpable counters of a pumpable decomposition can
be simultaneously raised to arbitrarily high values, we describe a sufficient
condition for a $K$-run to admit a non-trivial pumpable decomposition.
Strong pumping constant For each $M, N, K \in \mathbb{N}$, we define the strong
pumping constant $\Pi_{M,N,K}$ by induction on $N$ as follows (recall that $\pi_{M,N,K}$
denotes the weak pumping constant for $(M, N, K)$):

$\forall M, K \in \mathbb{N}.\quad \Pi_{M,0,K} = 1$

$\forall M, N, K \in \mathbb{N}.\quad \Pi_{M,N+1,K} = \Pi_{M,N,\pi_{M,N+1,K}+K} + \pi_{M,N+1,K} + K$
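The recurrence can be unfolded mechanically once the weak pumping constant is available. The sketch below takes the weak constant as a parameter `weak_pi`, since its definition lies outside this section; the toy function used in the example is a placeholder chosen only so that the recursion can be run, and is not the paper's $\pi_{M,N,K}$.

```python
def strong_pumping_constant(M, N, K, weak_pi):
    """Unfold Pi_{M,N,K} from the recurrence, given a weak constant weak_pi(M, N, K)."""
    if N == 0:
        return 1                                   # Pi_{M,0,K} = 1
    p = weak_pi(M, N, K)                           # pi_{M,N,K}
    # Pi_{M,N,K} = Pi_{M,N-1,pi_{M,N,K}+K} + pi_{M,N,K} + K
    return strong_pumping_constant(M, N - 1, p + K, weak_pi) + p + K


def toy_weak_pi(M, N, K):
    """Placeholder only: NOT the weak pumping constant of the paper."""
    return M * (K + 1)


print(strong_pumping_constant(2, 1, 0, toy_weak_pi))   # 3
print(strong_pumping_constant(2, 2, 0, toy_weak_pi))   # 11
```

Note how the third index grows at each level of the recursion: the bound for $N$ counters is obtained from the bound for $N-1$ counters started with larger initial counter values.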

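Lemma 4.3 (Decomposition). Let $A$ be a message-passing automaton with
$M$ states and $N$ counters and let $K \in \mathbb{N}$. Let $\rho : \chi_0 \Longrightarrow \chi_m$ be any
$K$-run of $A$. Then, there is a pumpable decomposition

$\chi_0 \stackrel{u_1}{\Longrightarrow} \chi_{i_1} \stackrel{v_1}{\Longrightarrow} \chi_{j_1} \stackrel{u_2}{\Longrightarrow} \chi_{i_2} \stackrel{v_2}{\Longrightarrow} \chi_{j_2} \cdots \stackrel{u_n}{\Longrightarrow} \chi_{i_n} \stackrel{v_n}{\Longrightarrow} \chi_{j_n} \stackrel{u_{n+1}}{\Longrightarrow} \chi_m$

of $\rho$ such that for every counter $C$,
if $C(\chi_j) > \Pi_{M,N,K}$ for some $j \in [0..m]$, then there exists $k \in [1..n]$ such that
$\Delta_C(v_k)$ is positive.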

To prove this lemma, we need the following result. 

Proposition 4.4. Let $A$ be a message-passing automaton with $M$ states and $N$
counters and let $\rho : \chi_0 \Longrightarrow \chi_n$ be a $K$-run of $A$ in which some counter value
exceeds $\pi_{M,N,K} + K$. Then, there is a prefix $\sigma : \chi_0 \Longrightarrow \chi_s$ of $\rho$ such that:

- For each $m \in [0..s]$ and every counter $C$, $C(\chi_m) \le \pi_{M,N,K} + K$.

- There exists $r \in [0..s-1]$ such that $\sigma : \chi_0 \Longrightarrow \chi_r \Longrightarrow \chi_s$, $Q(\chi_r) = Q(\chi_s)$
and $F(\chi_r) < F(\chi_s)$.

Proof. Suppose that the lemma does not hold. Let $\rho : \chi_0 \Longrightarrow \chi_n$ be a
computation of minimum length which fails to satisfy the lemma. Since the
initial counter values in $\rho$ are bounded by $K$ and some counter value exceeds
$\pi_{M,N,K} + K$ in $\rho$, it must be the case that the length of $\rho$ is at least $\pi_{M,N,K}$.
By the definition of $\pi_{M,N,K}$, there exist $i$ and $j$, $i < j \le \pi_{M,N,K}$, such
that $Q(\chi_i) = Q(\chi_j)$ and $F(\chi_i) \le F(\chi_j)$. Since $\rho$ is a $K$-run and $j \le \pi_{M,N,K}$, all
counter values at the configurations $\chi_0, \chi_1, \ldots, \chi_j$ must be bounded by $\pi_{M,N,K} + K$.
If $F(\chi_i) < F(\chi_j)$, $\rho$ would satisfy the lemma with $r = i$ and $s = j$, so it must
be the case that $F(\chi_i) = F(\chi_j)$.

Since $\chi_i = \chi_j$, we can construct a shorter computation $\rho' = \chi_0 \Longrightarrow \chi_i \Longrightarrow \chi_{j+1} \Longrightarrow \chi_n$.
It is easy to see that the same counter whose
value exceeded $\pi_{M,N,K} + K$ in $\rho$ must also exceed $\pi_{M,N,K} + K$ in $\rho'$: the only
configurations visited by $\rho$ which are not visited by $\rho'$ are those in the
interval $\chi_{i+1}, \chi_{i+2}, \ldots, \chi_j$. However, we have already seen that all counter values in
$\chi_0, \chi_1, \ldots, \chi_j$ are bounded by $\pi_{M,N,K} + K$.

It is clear that if $\rho'$ satisfies the lemma, then so does $\rho$. On the other hand, if
$\rho'$ does not satisfy the lemma, then $\rho$ is not a minimum length counterexample
to the lemma. In either case we obtain a contradiction. □

We now return to the proof of the Decomposition Lemma. 

Proof (of Lemma 4.3). The proof is by induction on $N$, the number of counters.
Basis: If $N = 0$, set $n = 0$ and $u_1 = \rho$.

Induction step: Let $\Gamma_{gt}$ denote the set of counters whose values exceed $\Pi_{M,N,K}$
in the $K$-run $\rho$.

If $\Gamma_{gt} = \emptyset$, we set $n = 0$ and $u_1 = \rho$.

Otherwise, by Proposition 4.4 we can find positions $r$ and $s$ in $\rho$ such that
$\chi_0 \stackrel{u'}{\Longrightarrow} \chi_r \stackrel{v'}{\Longrightarrow} \chi_s \Longrightarrow \chi_m$, with $Q(\chi_r) = Q(\chi_s)$, $F(\chi_r) < F(\chi_s)$ and all counter
values at $\chi_0, \chi_1, \ldots, \chi_s$ bounded by $\pi_{M,N,K} + K$.

Let $\Sigma$ be the input alphabet of $A$ and $\Gamma$ its set of counters. Fix a counter
$C'$ which increases strictly between $\chi_r$ and $\chi_s$, that is, $C'(\chi_s) > C'(\chi_r)$. By
our choice of $\chi_r$ and $\chi_s$, such a counter must exist. Construct an automaton
$A'$ with input alphabet $\Sigma \cup \{C'^+, C'^-\}$ and counters $\Gamma \setminus \{C'\}$. The states and
transitions of $A'$ are the same as those of $A$. In other words, $A'$ behaves like $A$
except that it treats moves involving the counter $C'$ as input letters.



Consider the computation $\chi_s \Longrightarrow \chi_m$ of $A$. It is easy to see that there
is a corresponding computation $\rho' : \chi'_s \Longrightarrow \chi'_m$ of $A'$ such that for each
$k \in [s..m]$, $Q(\chi'_k) = Q(\chi_k)$ and for each counter $C \ne C'$, $C(\chi'_k) = C(\chi_k)$.

From Proposition 4.4 we know that $\rho'$ is in fact a $(\pi_{M,N,K}+K)$-run of $A'$.
Further, for every counter $C$ in $\Gamma_{gt} \setminus \{C'\}$, there exists a $j \in [s..m]$ such that
$C(\chi'_j) = C(\chi_j) > \Pi_{M,N,K} > \Pi_{M,N-1,\pi_{M,N,K}+K}$. (In the $K$-run $\rho$, no counter
could have exceeded $\Pi_{M,N,K}$ before $\chi_s$ because Proposition 4.4 guarantees that
all counter values at $\chi_0, \chi_1, \ldots, \chi_s$ are bounded by $\pi_{M,N,K}+K$.) By the induction
hypothesis, we can find a pumpable decomposition
of $\rho'$, with pumpable blocks $v'_1, \ldots, v'_p$, such that if $C$ is a counter with $C(\chi'_j) > \Pi_{M,N-1,\pi_{M,N,K}+K}$ for some
$j \in [s..m]$, then there exists $k \in [1..p]$ such that $\Delta_C(v'_k)$ is positive.

Consider the corresponding computation 




of A. In this computation, for each k G [l..p], QiXi'^) = ~ ~ 

QiXjO- Further, for each C G Fgt\{C'}, C(xyJ = C{xpJ and C{xfJ = C(XfJ- 

Hi . . 1 

We prefix the computation Xs Xm with the iX-run xo Xr 

Xs which we used to identify Xs and Xr- We then assert that the composite 
iX-run 




provides the decomposition 
of $\rho$ claimed in the statement of the lemma. In other words, $u_1 = u'$, $v_1 = v'$,
$\chi_{i_1} = \chi_r$ and $\chi_{j_1} = \chi_s$, while for $k \in [2..n]$, $u_k = u'_{k-1}$, $v_k = v'_{k-1}$, $\chi_{i_k} = \chi_{i'_{k-1}}$
and $\chi_{j_k} = \chi_{j'_{k-1}}$.

Let us verify that this decomposition satisfies all the conditions required by
the lemma.

First we verify that this decomposition is pumpable.

- Since $p \le N-1$, it is clear that $n = p+1 \le N$.

- By construction $Q(\chi_{i_1}) = Q(\chi_r) = Q(\chi_s) = Q(\chi_{j_1})$. For $k \in [2..n]$, $Q(\chi_{i_k}) = Q(\chi_{i'_{k-1}}) = Q(\chi_{j'_{k-1}}) = Q(\chi_{j_k})$.

- We know that $\Delta v_1 = \Delta v'$ is non-zero and strictly positive by the choice of
$v'$. For $k \in [2..n]$, we know that $\Delta_C(v_k) = \Delta_C(v'_{k-1})$ for $C \ne C'$. Since we
have already established that $\Delta v'_{k-1}$ is non-zero and has at least one positive
entry for $k \in [2..n]$, it follows that the corresponding change vectors $\Delta v_k$ are
also non-zero and have at least one positive entry.

- Let $C$ be a counter and $k \in [1..n]$ such that $\Delta_C(v_k)$ is negative. Since
$\Delta v_1 = \Delta v'$ is positive by the choice of $v'$, it must be that $k \in [2..n]$. If
$C \ne C'$, then $\Delta_C(v'_{k-1}) = \Delta_C(v_k)$ is negative. In this case, we already know
that there exists $\ell \in [2..k-1]$ such that $\Delta_C(v'_{\ell-1}) = \Delta_C(v_\ell)$ is positive.

On the other hand, if $C = C'$, it could be that $\Delta_C(v'_i)$ is negative for all
$i \in [1..p]$, since $C'$ is treated as an input letter rather than as a counter in
the automaton $A'$. However, we know that $\Delta_{C'}(v_1) = \Delta_{C'}(v')$ is positive by
the choice of $v'$ and $C'$, so $C'$ also satisfies the condition of the lemma.

Finally, let $C$ be a counter such that $C(\chi_j) > \Pi_{M,N,K}$ for some $j \in [1..m]$.
If $C \ne C'$, then $C(\chi_j) > \Pi_{M,N-1,\pi_{M,N,K}+K}$ for some $j \in [s..m]$, so we already
know that $\Delta_C(v'_{k-1}) = \Delta_C(v_k)$ is positive for some $k \in [2..n]$. On the other
hand, if $C = C'$, we know that $\Delta_C(v_1) = \Delta_C(v')$ is positive by the choice of $v'$
and $C'$.

□ 



The Counter Pumping Lemma allows us to pump blocks of transitions in 
a computation. However, it is possible for a pumpable block to consist solely 
of invisible transitions which increment and decrement counters. Using the De- 
composition Lemma, we can prove a more traditional kind of pumping lemma, 
stated in terms of input strings. We omit the proof. 

Lemma 4.5 (Visible Pumping). Let $L$ be a message-passing recognisable
language. There exists $n \in \mathbb{N}$ such that for all input strings $w$, if $w \in L$ and
$|w| \ge n$ then $w$ can be written as $w_1 w_2 w_3$ such that $|w_1 w_2| \le n$, $|w_2| \ge 1$ and
$w_1 w_2^i w_3 \in L$ for all $i \ge 1$.

Another consequence of Lemmas 4.2 and 4.3 is a strict hierarchy theorem for
message-passing automata, whose proof we omit.

Lemma 4.6 (Counter Hierarchy). For $k \in \mathbb{N}$, let $\mathcal{C}_k$ be the set of languages
recognisable by message-passing automata with $k$ counters. Then, for all $k$, $\mathcal{C}_k \subsetneq \mathcal{C}_{k+1}$.
5 Regularity of Message-Passing Recognisable Languages 

Automata with Bounded Counters 

Let $A = (Q, \Sigma, \Gamma, T, q_{in}, F)$ be a message-passing automaton. For $K \in \mathbb{N}$, define
$A[K] = (Q[K], T[K], Q[K]_{in}, F[K])$ to be the finite-state automaton over the
alphabet $\Sigma \cup \Gamma^{\pm}$ given by:

- $Q[K] = Q \times \{f \mid f : \Gamma \to [0..K]\}$, with $Q[K]_{in} = (q_{in}, \bar{0})$.

- $F[K] = F \times \{f \mid f : \Gamma \to [0..K]\}$.

- If $(q, d, q') \in T$, then $((q, f), d, (q', f')) \in T[K]$ where:

• If $d \in \Sigma$, $f' = f$.

• If $d = C^+$, $f'(C') = f(C')$ for all $C' \ne C$ and
$f'(C) = f(C) + 1$ if $f(C) < K$, and $f'(C) = K$ otherwise.

• If $d = C^-$, $f'(C') = f(C')$ for all $C' \ne C$, $f(C) \ge 1$ and
$f'(C) = f(C) - 1$ if $f(C) < K$, and $f'(C) = K$ otherwise.
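The counter component of $A[K]$ is a saturating counter: below $K$ it behaves like an ordinary counter, and once it reaches $K$ it stays there. A minimal sketch of a single counter move, under one reading of the definition above (valuations as dictionaries, counter moves as pairs `(C, +1)` / `(C, -1)`; the name `step_bounded` is hypothetical):

```python
def step_bounded(f, move, K):
    """Apply one counter move of A[K] to valuation f.

    Returns the new valuation, or None if the move is not enabled.
    Assumed encoding: move is (counter_name, +1) or (counter_name, -1).
    """
    counter, step = move
    g = dict(f)
    if step > 0:                       # C+ : increment, saturating at K
        g[counter] = f[counter] + 1 if f[counter] < K else K
    else:                              # C- : needs f(C) >= 1
        if f[counter] < 1:
            return None
        # a "full" counter (value K) stays at K when decremented
        g[counter] = f[counter] - 1 if f[counter] < K else K
    return g


K = 2
print(step_bounded({"C": 1}, ("C", +1), K))   # {'C': 2}
print(step_bounded({"C": 2}, ("C", -1), K))   # {'C': 2}
print(step_bounded({"C": 0}, ("C", -1), K))   # None
```

Because a full counter never drops below $K$, the finite-state automaton may accept runs that the original automaton would block; this is why $A[K]$ over-approximates $A$.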



Each transition t = {{q, f),di (9^ f)) G T[K] corresponds to a unique transition 
{q,d,q') G T, which we denote For a sequence of transitions tO ■ ■ - tn, we 
write {to ■ ■ ■ tn)~^ for ■ For any sequence tO ■ ■ - tn of transitions 

in T[K], a{tit2-..tn) = a((tit2 • ■ • Moreover, if (go,/o) {qn, f'n) 

and (go, /o) Xn, then Q(xn) = dn- 

Thus, the finite-state automaton A[K] behaves like a message-passing au- 
tomaton except that it deems any counter whose value attains a value K to be 
“full” . Once a counter is declared to be full, it can be decremented as many 
times as desired. The following observations are immediate. 



Proposition 5.1. (i) If {qo, f’o) idiJi) ^ idriJn) « computa- 
tion of A then, (qo,fo) (<71, /i) {dm fn) is a computation of 

A[K] where 



- t'n = (tiO ■■■tn) ^ 

-yCGF.ViG [l..n]. MO = 



f'{C) if f'iO < K for allj<i 
K otherwise 



(a) Let {do,fo) idijfi) ■■■ idnjfn) be a computation of A[K\. 
Then there is a maximal prefix tO ■ ■ ■ti of tO ■ ■ ■tn such that there is a 

computation {do, fo) {di,f[) ■■■ {di, ft) of A with fo = /^. 

Moreover, if £ < n, then for some counter C, a{t'f,j^-f) = C~ , f(0 = 0 ond 
there is a j < £ such that /j(C) = K. 

(iii) Let $L(A[K])$ be the language over $\Sigma \cup \Gamma^{\pm}$ accepted by $A[K]$. Let $L_\Sigma(A[K]) =
\{w|_\Sigma \mid w \in L(A[K])\}$. Then, $L(A) \subseteq L_\Sigma(A[K])$.
Synchronised Products of Message-Passing Automata 

Product automata Let Ai = {Qi, Si, Fi), i = 1,2, be a pair of 

message-passing automata. The product automaton Ai x A2 is the structure 



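- $d \in (\Sigma_1 \cup \Gamma_1^{\pm}) \setminus (\Sigma_2 \cup \Gamma_2^{\pm})$, $(q_1, d, q'_1) \in T_1$ and $q_2 = q'_2$.

- $d \in (\Sigma_2 \cup \Gamma_2^{\pm}) \setminus (\Sigma_1 \cup \Gamma_1^{\pm})$, $(q_2, d, q'_2) \in T_2$ and $q_1 = q'_1$.

For $t = ((q_1, q_2), d, (q'_1, q'_2)) \in T$ and $i \in \{1, 2\}$, let $\pi_i(t)$ denote $(q_i, d, q'_i)$
if $d \in (\Sigma_i \cup \Gamma_i^{\pm})$ and the empty string $\varepsilon$ otherwise. As usual, $\pi_i(t_1 t_2 \ldots t_n)$ is
just $\pi_i(t_1)\pi_i(t_2) \ldots \pi_i(t_n)$. Thus, for a sequence of transitions $\rho = t_1 t_2 \ldots t_n$ over
$T_1 \times T_2$, $\pi_1(\rho)$ and $\pi_2(\rho)$ denote the projections of $\rho$ onto the transitions of $A_1$ and
$A_2$ respectively. Clearly, $\alpha(t_1 t_2 \ldots t_n)|_{\Sigma_i \cup \Gamma_i^{\pm}} = \alpha(\pi_i(t_1 t_2 \ldots t_n))$ for $i \in \{1, 2\}$.

We shall often write a configuration $((q_1, q_2), f)$ of $A_1 \times A_2$ as a pair of
configurations $((q_1, f_1), (q_2, f_2))$ of $A_1$ and $A_2$, where $f_1$ and $f_2$ are restrictions
of $f$ to $\Gamma_1$ and $\Gamma_2$ respectively.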

The following observations are easy consequences of the definition of product 
automata. 



(92, /2) are computations of Ai and A2 respectively. 

(ii) If $\Sigma_1 = \Sigma_2$ and $\Gamma_1 \cap \Gamma_2 = \emptyset$, then $L(A_1 \times A_2) = L(A_1) \cap L(A_2)$.

5.1 Regularity and Closure under Complementation

Our first characterisation of regular message-passing recognisable languages is
the following.

Theorem 5.3. Let $L$ be a language over $\Sigma$. $L$ and $\overline{L}$ are message-passing
recognisable iff $L$ is regular.

This result is related to the main result of [MNRS98], which states that if we
record the behaviour of message-passing systems as tuples of sequences, every
robust system is effectively finite-state. In the sequential setting, a language $L$
would be robust in the sense of [MNRS98] if there were a single automaton $A$
with accept and reject states such that for each word $w \in L$, every run of $A$ on
$w$ leads to an accept state and for each word $w \notin L$, every run of $A$ on $w$ leads
to a reject state. Here, the requirement on $L$ is much weaker: all we demand
is that both $L$ and $\overline{L}$ be accepted independently by message-passing automata,
possibly with very different state spaces.

To prove the theorem, we need an auxiliary result. Let $L \subseteq \Sigma^*$ be such that
both $L$ and its complement $\overline{L}$ are message-passing recognisable. Let $L = L(A)$
and $\overline{L} = L(\overline{A})$, where we may assume that $A$ and $\overline{A}$ use disjoint sets of counters.




>.> — ^ X-V // 1 — \ / O TT\\ // /> \ / /. \ \ . 
The language accepted by $A \times \overline{A}$ must be empty. Let $M$ be the number of states
of $A \times \overline{A}$ and $N$ be the number of counters that it uses. Let $K$ be a number
greater than $\Pi_{M,N,0}$, the strong pumping constant for $(M, N, 0)$. Recall that
$A[K] = (Q[K], T[K], Q[K]_{in}, F[K])$ is a finite-state automaton without counters
working on the input alphabet $\Sigma \cup \Gamma^{\pm}$.

Lemma 5.4. $L(A[K] \times \overline{A}) = \emptyset$.



Proof. Let A[K] = {Q[K],T[K],Q[K]in, F[K]) and xi = (Q, T, T, r,g;„F). Each 
computationp otA[K]xA is of the form ((go, 0), (gg, 0)) ((^i, /i), (?i, /i)) 

• • ■ ^ ((9n,/n),(g„,7n)), where, for i e [0..n], m G T[K] x T. 

By Propositions 15. II and 15. 21 corresponding to the sequence U\U 2 ■ ■ - Un there 
exists a maximal sequence of transitions V\V 2 ■ ■ ■ Vm oi Ax A where: 



— Each Vi belongs to T x T. 

— For each i G [l..m], TT 2 {vi) 

— For each i G [l..m], 7ri('i;i) 



TT2{Ui). 

f (7Ti(Ui))~^ if 7 Ti(Mj) ^ e 
1 e otherwise 



- P' ■■ ((9o,0),(go,0))-^((gi,/(),(gi,/i))^-- fDAtlmJm)) is a 

computation of A x A. 

— If TO < n, then for some C G F, a{um+i) = C~ , f'm{C) = 0 and /j(C') = K 
for some j G [0..to]. 



Let us define the residue length of p to be n—m. 

Suppose that L{A[K] x A) is non-empty. Since L{A x A) is empty, it is easy 
to see that any accepting run of A\K] x A has a non-zero residue length. Without 
loss of generality, assume that the run p considered earlier is an accepting run 
of A[K\ X A whose residue length is minimal. Then, in the corresponding run 
p' of Ax A, the counter C G F attains the value K along p' and then goes 
to 0 at the end of the run so that the move labelled C~ is not enabled at 

{{dm: fm): {Q 771 ■> f m)')' 

Since K exceeds the strong pumping constant for .4 x .4, by Lemma |4. 21 we 

can find an alternative run p' : ((go, 0), (gg, 0)) ^ {(qi, fi), (q£, fi)) with 

{q'nd't) = ^ all other counter values at (ffff) at least 

as large as at (/m,/m)- In particular, every counter which exceeded the cutoff 
value K along p' is pumpable and thus exceeds K along p' as well. 

By Propositions ED and o we can construct a corresponding sequence of 
transitions u'l u '2 ■ ■ - u'^ over T[K]xT such that tti (u( U 2 ... ) = (tti (m(^ U 2 ... ))“ ^ 

and ir 2 {v[v 2 ■ . ■ Ff) = n 2 {u[u 2 ■ ■ ■ u'^), where p : ((go, 0), (gg,0)) 

is a run of A[K] x A with (g^',g^) = (gm,9^) and ffiC) > 
fm{C) for each C G F. 

We already know that /^{C) > fm{C) for each C G F. Further, since every 
counter which exceeded the cutoff value K along p' also exceeds K along p' , we 
know that any counter which has become full along p would also have saturated 
along p. Thus, we can extend p to an accepting run <j by appending the sequence 
of transitions Um+iUm+2 ■ ■ - Un which occur at the end of the accepting run p. 

Recall that a{um+i) = C~ and fg{C) > 1 by our choice of p' . From this, it 
follows that the residue length of the newly constructed accepting run cr is at 
least one less than the residue length of p, which is a contradiction, since p was 
assumed to be an accepting run of minimal residue length. □ 



We can now prove Theorem 5.3.

Proof (of Theorem 5.3).

Let $L = L(A)$ and $\overline{L} = L(\overline{A})$. Define $A[K]$ as above. We claim that $L_\Sigma(A[K]) = L(A)$.
By Proposition 5.1 we know that $L(A) \subseteq L_\Sigma(A[K])$. On the other hand,
from the previous lemma it follows that $L_\Sigma(A[K]) \cap L(\overline{A}) = \emptyset$. This implies that
$L_\Sigma(A[K]) \subseteq \overline{L(\overline{A})}$, which means that $L_\Sigma(A[K]) \subseteq L(A)$. So $L(A) = L_\Sigma(A[K])$.
Since $A[K]$ is a finite-state automaton, it follows that $L(A)$ is regular. Therefore,
if a language and its complement are message-passing recognisable then the
language is regular.

The converse is obvious. □

Observe that our construction is effective: given message-passing automata $A$
and $\overline{A}$ for $L$ and $\overline{L}$ respectively, we can construct a finite-state automaton $A[K]$
for $L$.

5.2 Regularity and Determinacy 

Our next characterisation of regularity is in terms of deterministic message- 
passing automata. 

Deterministic Message-Passing Automata A message-passing automaton
$A = (Q, \Sigma, \Gamma, T, q_{in}, F)$ is said to be deterministic if the following two conditions
hold:

- If $(q, d_1, q_1), (q, d_2, q_2) \in T$, with $d_1, d_2 \in \Sigma$, then $d_1 = d_2$ implies $q_1 = q_2$.

- If $(q, d_1, q_1), (q, d_2, q_2) \in T$, with $d_1 \in \Gamma^{\pm}$, then $d_1 = d_2$ and $q_1 = q_2$.
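The two conditions amount to a local check on the outgoing transitions of each state: a state with a counter move may have no other outgoing transition, and a state with only input-letter moves may have at most one target per letter. A minimal sketch, assuming transitions are triples `(q, d, q2)` where an input letter `d` is a string and a counter move is a pair `(C, +1)` or `(C, -1)`; this encoding and the name `is_deterministic` are illustrative only:

```python
def is_deterministic(transitions):
    """Check the two determinism conditions on a set of transitions (q, d, q2)."""
    outgoing = {}
    for q, d, q2 in set(transitions):
        outgoing.setdefault(q, set()).add((d, q2))

    for out in outgoing.values():
        if any(isinstance(d, tuple) for d, _ in out):
            # second condition: a counter move must be the only outgoing transition
            if len(out) > 1:
                return False
        else:
            # first condition: each input letter leads to at most one state
            target = {}
            for d, q2 in out:
                if target.setdefault(d, q2) != q2:
                    return False
    return True


t_ok = {("q0", "a", "q1"), ("q0", "b", "q2"), ("q1", ("C", +1), "q0")}
t_letters = {("q0", "a", "q1"), ("q0", "a", "q2")}
t_mixed = {("q0", ("C", +1), "q1"), ("q0", "a", "q2")}
print(is_deterministic(t_ok), is_deterministic(t_letters), is_deterministic(t_mixed))
```

The third example is rejected because it offers a choice between a counter move and another transition, which is the situation the definition rules out.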

Though this notion of determinism seems rather strong, any relaxation of
the definition will allow deterministic automata to simulate non-deterministic
automata in a trivial manner. If we permit the automaton to choose between
a counter move and another transition (which may or may not be a counter
move), we can add a spurious counter $C$ and replace any non-deterministic choice
between transitions $t_1 = (q, d, q_1)$ and $t_2 = (q, d, q_2)$ by a choice between $t_1$ and
a move $(q, C^+, q_C)$ involving $C$ which leads to a new state where $t'_2 = (q_C, d, q_2)$
is enabled.

The following result characterises languages accepted by deterministic mes- 
sage-passing automata. Similar results have been demonstrated for deterministic 
Petri net languages |Pei 8 YIV 82 | . 
Proposition 5.5. Let $A$ be a deterministic message-passing automaton. Then,
either $L(A)$ is regular or there is a word $w \notin L(A)$ such that every extension of
$w$ also does not belong to $L(A)$.

Proof Sketch. Let $A$ be a deterministic message-passing automaton. We say that
$A$ is $\Gamma$-blocked on a word $w$ if, while reading $w$, $A$ gets stuck at a state with a
single outgoing edge labelled $C^-$, for some counter $C$. If $A$ is $\Gamma$-blocked on $w$,
it is easy to see that $A$ is also $\Gamma$-blocked on $w'$ for any extension $w'$ of $w$, so every
extension of $w$ is outside $L(A)$.

If $A$ is not $\Gamma$-blocked for any word $w$, we construct a finite-state automaton
$A'$ with $\varepsilon$-moves over $\Sigma$ with the same state space as $A$. In $A'$, each counter
move $(q, d, q')$, $d \in \Gamma^{\pm}$, is replaced by an $\varepsilon$-move $(q, \varepsilon, q')$. There is a natural
correspondence between computations of $A$ and runs of $A'$. It is easy to see that
$L(A) \subseteq L(A')$. Using the fact that $A$ is not $\Gamma$-blocked for any word $w$, we can
show that any accepting run of $A'$ can be mapped back to an accepting run of
$A$. Thus $L(A') \subseteq L(A)$. In other words, $L(A') = L(A)$, so $L(A)$ is regular. □



The preceding characterisation implies that non-deterministic message-passing
automata are strictly more powerful than deterministic message-passing
automata. Consider the language $L = \{w \mid w = w_1 a^m b^n a w_2\}$ where $w_1, w_2 \in
\{a, b\}^*$ and $m \ge n \ge 1$. This language is message-passing recognisable but
violates the condition of Proposition 5.5: $L$ is not regular and for any word $w \notin L$,
we can always find an extension of $w$ in $L$; for instance, $w\,aba \in L$ for all
$w \in \{a, b\}^*$.

Observe, however, that even deterministic message-passing automata are
strictly more powerful than normal finite-state automata. For instance, the
language $L_{ge}$ of Example 2.1 is not regular but the automaton accepting the
language is deterministic.

With these observations about deterministic automata behind us, we can now
state our alternative characterisation of regularity. First, we need some notation.
For a string $w$, let $w^R$ denote the string obtained by reading $w$ from right to
left. For a language $L$, let $L^R$ be the set of strings $\{w^R \mid w \in L\}$.

Theorem 5.6. Let $L$ be a message-passing recognisable language such that $L^R$
is recognised by a deterministic message-passing automaton. Then, $L$ is regular.

The idea is to compute a constant r which depends on the number of states 
M and the number of counters of .A and to then show that L is recognised 
by the finite-state automaton L[t\. The proof is quite involved and we omit it 
due to lack of space. 

6 Discussion 

Other models for asynchronous communication Many earlier attempts
to model asynchronous systems focus on the infinite-state case, for instance, the
port automaton model of Panangaden and Stark [PS88] and the I/O automaton
model of Lynch and Tuttle. Also, earlier work has looked at issues far
removed from those which are traditionally considered in the study of finite-state
systems.

Recently, Abdulla and Jonsson have studied decision problems for distributed 
systems with asynchronous communication However, they model chan- 

nels as unbounded, fifo buffers, a framework in which most interesting questions 
become undecidable. The results of show that the fifo model becomes 

tractable if messages may be lost in transit: questions such as reachability of 
configurations become decidable. While their results are, in general, incompara- 
ble to ours, we remark that their positive results hold for our model as well. 



Petri net languages Our model is closely related to Petri nets [I078l.l8hj . We 
can go back and forth between labelled Petri nets and message-passing networks 
while maintaining a bijection between the firing sequences of a net N and the 
computations of the corresponding automaton A. 

There are several ways to associate a language with a Petri net |H75I.T86fPet81] . 
The first is to examine all firing sequences of the net. The second is to look at 
firing sequences which lead to a set of final markings. The third is to identify 
firing sequences which reach markings which dominate some final marking. The 
third class corresponds to message-passing recognisable languages. 

A number of positive results have been established for the first class of 
languages — for instance, regularity is decidable |(;Y80IVV80j . On the other 
hand, a number of negative results have been established for the second class of 
languages — for instance, it is undecidable whether such a language contains all 
strings mm- However, none of these results, positive or negative, carry over 
to the third class — ours is one of the few tangible results for this class of Petri 
net languages. 



Directions for future work We believe that Theorem 5.6 holds even without
the determinacy requirement on $L^R$. An interesting question is to develop a
method for transforming sequential specifications in terms of message-passing
automata into equivalent distributed specifications in terms of the message-passing
network model of [MNRS98]. Another challenging question is the decidability of
regularity for message-passing recognisable languages.
Abstract. This paper provides a brief overview of a tutorial on current 
research directions in mobile computation. In particular we provide an 
overview of the various abstract calculi for mobility, an overview of the 
programming languages that have been developed for mobile computa- 
tion, and provide a comparative study of the language abstractions via 
the calculi and discuss the underlying challenges. 



1 Introduction 

Mobility denotes the physical or virtual movement of computational resources 
(comprising hardware and software) across a local or global network. The term 
mobile computing is used for environments where the hardware is mobile, while 
the term mobile computation is used in situations where software is mobile. In 
this survey we shall focus on mobile computation. With rapid advances in Inter- 
net technology, and the availability of powerful laptop and palmtop computers, 
the concept of mobility is undergoing a metamorphosis from a mere theoret- 
ical curiosity to become one of the essential requirements of every computing 
environment. The phenomenon of mobility on a global scale has the potential 
of causing wide ranging effects on the practice of computing and programming. 
Further, by providing a completely new paradigm for understanding the nature 
of computation, it forces us to reconsider our conceptual understanding of the 
notion of computation as well. 

In this tutorial, we shall provide an overview of the calculi for mobility, an 
overview of the programming languages that have been developed for mobile 
computation, and provide a comparative study of the language abstractions via 
the calculi and discuss the underlying challenges. The following sections provide a
brief overview of the various calculi for mobility and their evolution, and of the widely
used programming languages for mobile computation.

2 Calculi for Mobility 

What is an appropriate abstract model for mobile computation? Traditional
research has focussed along three independent strands: functional, concurrent,
and distributed computing. However, none of these models adequately captures
the significant features of mobile computation [2]. Mobile computation demands
an entirely new paradigm.

A number of calculi have been proposed over the years to study concurrent 
computation. New calculi are being evolved from these, with the addition of 
notions to capture features associated with mobile computation. 

One of the earliest abstract framework for concurrent computing was pro- 
vided by Actors P). Though the framework had an intuitive appeal to model 
massively parallel computation, it was not amenable to formal reasoning. Recent 
proposals have sought to rectify this situation with notions of nested structures 
of arbitrary depth, which mirrors the nesting of administrative domains on the 
web PS|- The earliest proposals of formal calculi for concurrent computing could 
deal only with static connectivity among processes, and in certain cases even the 
total number of processes had to be fixed in advance. Among these are Petrinets 
P3|) CSP PBIj and CCS PJ. In the setting of the Internet, communication links 
appear and disappear frequently. This is due to wide fluctuations in bandwidth 
of links, unpredictable failures and recovery of links giving rise to connections 
through alternative routes. So calculi which are restricted to static connectiv- 
ity are clearly inadequate. The next generation of process calculi could model 
networks of processes with dynamic connectivity. Among these 7r-calculus m, 
and Asynchronous 7r-calculus im directly modeled channel mobility, while Linda 
and CHOCS modeled process mobility. These calculi lack the notion of 
distributed locations of processes, which cannot be ignored in mobile compu- 
tation over the web. These calculi have in turn been extended with primitives 
to capture distributed concurrent computing. LLinda m extends Linda, while 
Join-calculus m extends 7r-calculus. The next logical step towards developing 
a calculus for mobile computation was to incorporate explicit notions of local- 
ity in these calculi The notion of locations is important in modeling barriers 
caused due to administrative domains which impose their own restrictions on the 
movement of messages across them. Such extensions gave rise to the Distributed 
Join-calculus m and the Ambient calculus m- The Ambient calculus can also 
model mobility of active computations. 

Apart from these calculi, a number of others have been proposed to study 
various orthogonal issues in mobile computation. The Spi calculus [Q is a pre- 
liminary attempt to model security aspects, while Mobile Unity m incorporates 
techniques for reasoning about and specifying mobile computations. 

There are a number of issues that need further exploration, such as refined 
notions of structured data, typing, and correctness proofs in a concurrent dis- 
tributed mobile context. These notions need to be unified in a coherent semantic 
framework. Such an enterprise will lead to fundamental insights into the phe- 
nomenon of computation itself. 
3 Programming Languages for Mobile Computation 

One of the most fascinating outcomes of the development of mobile computation 
is the concrete realization of mobile software. Mobile software is programming 
language code which is shipped across a network from a source to a destination 
site, and is executed at the destination computer. The results (if any) are then 
returned to the source computer. The presence of mobile software in a frame- 
work where the source and destination computers are themselves mobile (due to 
hardware mobility) gives rise to a rich set of possibilities, and also raises 
a number of issues that need to be addressed. 
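
The ship, execute and return cycle described above can be sketched in a few 
lines. This is an illustration under stated assumptions, not the mechanism of 
any language discussed in this section: the network is replaced by an ordinary 
function call, the message format is invented for the example, and the names 
ship, execute_at_destination and run_demo are hypothetical. The destination 
runs the received code with no sandbox at all, so the sketch deliberately 
ignores the safety and security requirements that a real system must meet. 

```python
MOBILE_SOURCE = """
def word_count(text):
    return len(text.split())
"""


def ship(source_text, entry_point, argument):
    # Source site: package the program text and its input as a plain message.
    return {"source": source_text, "entry": entry_point, "arg": argument}


def execute_at_destination(message):
    # Destination site: load the received code into a fresh namespace.
    # WARNING: exec runs arbitrary code; this is unsafe outside a demo.
    namespace = {}
    exec(message["source"], namespace)
    function = namespace[message["entry"]]
    # The value returned here is what would be sent back to the source site.
    return function(message["arg"])


def run_demo():
    message = ship(MOBILE_SOURCE, "word_count", "code moves to the data")
    return execute_at_destination(message)
```

Here run_demo() returns 5, the number of words in the shipped argument. Only 
two messages cross the simulated network, one carrying the code and one 
carrying the result, however much work the shipped function does at the 
destination. 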

The desirability of mobile software arises primarily from the fact that it 
can vastly reduce the communication overheads while implementing services be- 
tween servers and clients across a network. The savings are particularly enhanced 
if the service happens to be an interactive one. Reduction in the messages ex- 
changed between server and client is particularly meaningful in an environment 
where communication links experience unpredictable fluctuations in their effec- 
tive bandwidth, are susceptible to frequent failures, and the messages have to 
traverse a number of distinct administrative domains which erect barriers to 
the flow of messages across them. Second, mobile software provides a great 
degree of flexibility to the server computers, wherein the kind and number of 
services offered by them need not be static and determined in advance. Third, 
the maintenance of networks of mobile computers is rendered easier when the 
clients can themselves control the installation of the latest software updates as and 
when required. Finally, as a valuable side effect, mobile software also leads to 
savings in storage space since programs can be fetched and executed on demand. 

The possibility of mobile software gives rise to at least three kinds of 
situations, depending on which entity takes the initiative in moving the 
software: the provider, the execution site, or the software itself. 

A number of issues crop up in the design and implementation of mobile soft- 
ware. First among these is the obvious requirement of portability. The same 
software should be capable of being executed across a variety of architectures 
and computing environments. The second requirement is that of safety. Safety 
demands that errors in the mobile software should not affect the environment 
of the execution site. Third, the myriad security issues such as secrecy, mes- 
sage integrity, authentication of servers and clients, and non-repudiation of mes- 
sages transmitted assume great significance. Finally, the need for efficiency is 
omnipresent. Various programming languages have been designed for mobile 
software, including those for agent-oriented programming. A few prominent ones 
among these are Obliq (Sj, Telescript ^S|, Limbo POj, O’Caml PH], Safe-Tcl [21], 
and Java ^]. A comparative study of these languages is contained in M- 

In spite of the availability of a number of languages for developing mobile 
software, there remains a wide scope for further improvements. For instance, 
transfer of active computations needs to be supported [Z|- In the context of 
mobility, it is not sufficient to allow only the transfer of passive program code 
across computers; effective and efficient ways of transferring computations in 
progress, along with their environments (and proofs!), need to be given careful 
consideration. The development of such programming paradigms could provide 
positive feedback to the architecture of the Internet itself. 

4 Discussion 

The phenomenon of mobility represents a major revolution in the discipline 
of computing and programming. Research on calculi for mobility is seeking to 
address the challenges arising from mobile computation by unifying various 
traditional computation models, while research on programming languages seeks 
to create tools which will translate theoretical possibilities into realistic systems 
that would constitute the computational environments of the decades to come. 

In order to fully translate the potential benefits of mobility to a computer- 
ized society, research is needed on a whole spectrum of issues ranging from the 
formulation of a formal logical basis for computation in a mobile world, to the 
development of cost-effective and user-friendly software tools which translate 
theoretical possibilities into practical and realistic computational environments. 

The scenario of autonomous entities freely traversing globally distributed 
networks requires the development of entirely new proof systems, behavioral 
theories, and software development methodologies in order to specify and 
establish properties of such computational entities [23]. Software tools, 
comprising implementations of verification systems and specification languages, 
would then be built on the basis provided by such proof methodologies. 

The area of network security acquires a prominent place when mobility is 
prevalent. Traditional research on security has focussed mostly on secrecy issues, 
which have been handled through cryptology. More recent work on security has 
dealt with ways of reasoning about authentication mechanisms using logics of 
belief 0. However, other aspects of security in mobile computation are yet to 
be addressed in a satisfactory manner H2j. 